Abstract. Inference for causal effects can benefit from the availability
of an instrumental variable (IV) which, by definition, is associated with 
the given exposure, but not with the outcome of interest other than 
through a causal exposure effect. Estimation methods for instrumental 
variables are now well established for continuous outcomes, but much 
less so for dichotomous outcomes. In this article we review IV estima- 
tion of so-called conditional causal odds ratios which express the effect 
of an arbitrary exposure on a dichotomous outcome conditional on the 
exposure level, instrumental variable and measured covariates. In addi- 
tion, we propose IV estimators of so-called marginal causal odds ratios 
which express the effect of an arbitrary exposure on a dichotomous 
outcome at the population level, and are therefore of greater public 
health relevance. We explore interconnections between the different es- 
timators and support the results with extensive simulation studies and 
three applications. 

Key words and phrases: Causal effect, causal odds ratio, instrumental 
variable, marginal effect, Mendelian randomization, logistic structural 
mean model. 



1. INTRODUCTION 

Most causal analyses of observational data rely 
heavily on the untestable assumption of no unmea- 
sured confounders. According to this assumption, 
one has available all prognostic factors of the exposure
that are also associated with the outcome other
than via a possible exposure effect on outcome. Con- 
cerns about the validity of this assumption plague 
observational data analyses and increase the uncer- 
tainty surrounding many study results (Greenland, 
2005). This is especially true in settings where the 
data analysis is based on registry data or focuses 
on research questions different from those conceived 
at the time of data collection. Substantial progress 
can sometimes be made in settings where measure- 
ments are available on a so-called instrumental vari- 
able (IV). This is a prognostic factor of the exposure, 
which is not associated with the outcome, except 
via a possible exposure effect on outcome (Angrist, 
1990; McClellan and Newhouse, 1994; Angrist, Im- 
bens and Rubin, 1996; Hernan and Robins, 2006). 
An instrumental variable Z for the effect of expo- 
sure X on outcome Y thus satisfies the following 
properties: (a) Z is associated with X; (b) Z affects 
the outcome Y only through X (i.e., often referred 
to as the exclusion restriction); (c) the association 
between Z and Y is unconfounded (i.e., often referred
to as the randomization assumption) (Hernan
and Robins, 2006). For instance, in the data analysis 
section, we will estimate the effect of Cox-2 treat- 
ment (versus nonselective NSAIDs) on gastrointesti- 
nal bleeding, thereby allowing for the possibility of 
unmeasured variables U confounding the association 
between X and Y, by choosing the physician's pre- 
scribing preference for Cox-2 (versus nonselective 
NSAIDs) as an instrumental variable (Brookhart 
and Schneeweiss, 2007). Because this is associated 
with Cox-2 treatment [i.e., (a)], it would qualify as 
an IV if it were reasonable that the physician's pre- 
scribing preference can only affect a patient's gas- 
trointestinal bleeding through his/her prescription 
[i.e., (b)] and is not otherwise associated with that 
patient's gastrointestinal bleeding [i.e., (c)]. Assump- 
tion (b) could fail, however, if preferential prescrip- 
tion of Cox-2 were correlated with other treatment 
preferences that have their own impact on gastroin- 
testinal bleeding; the latter assumption could fail if 
patients with high risk of bleeding are more often 
seen with physicians who prefer Cox-2 (Hernan and 
Robins, 2006). In this article, we will more gener- 
ally assume that the instrumental variables assump- 
tions (a), (b) and (c) hold conditional on a (possibly 
empty) set of measured covariates C. 

IVs have a long tradition in econometrics and are 
becoming increasingly popular in biostatistics and 
epidemiology. This is partly because the plausibil- 
ity of a measured variable as an IV can sometimes 
be partially justified on the basis of the study de- 
sign or biological theory. For instance, in random- 
ized encouragement designs whereby, say, pregnant 
women who smoke are randomly assigned to inten- 
sified encouragement to quit smoking or not, ran- 
domization could qualify as an IV for assessing the 
effects of smoking on low birth weight (Permutt and 
Hebel, 1989), since it guarantees the validity of IV 
assumption (c). The growing success of IV meth- 
ods in biostatistics and epidemiology can, however, 
be mainly attributed to applications in genetic epi- 
demiology (Smith and Ebrahim, 2004). Here, the 
random assortment of genes transferred from par- 
ents to offspring resembles the use of randomization 
in experiments and is therefore often referred to as 
"Mendelian randomization" (Katan, 1986). Build- 
ing on this idea, genetic variants may sometimes 
qualify as an IV for estimating the relationship be- 
tween a genetically affected exposure and a disease 
outcome, although violations of the necessary con- 
ditions may occur (see Didelez and Sheehan, 2007, 
and Lawlor et al., 2008, for rigorous discussions). 



Estimation methods for IVs are now well estab- 
lished for continuous outcomes. The case of dichoto- 
mous outcomes has received more limited attention. 
It turns out to be much harder because of the need 
for additional modeling and because of difficulties to 
specify congenial model parameterizations (see Sec- 
tions 2.2 and 3). This paper therefore combines dif- 
ferent, scattered developments in the biostatistical, 
epidemiological and econometric literature and aims 
to improve the clarity and comparability of these de- 
velopments by casting them within a common causal 
language based on counterfactuals.

Traditional econometric approaches have their 
roots in structural equations theory and have thereby 
largely focused on the estimation of conditional cau- 
sal effects, where rather than employing counterfac- 
tuals to define causal effects, conditioning is made 
on all common causes, U, of exposure X and out- 
come Y (see Blundell and Powell, 2003, for a re- 
view). By this conditioning, one can assign a causal 
interpretation to association measures such as 

$$\frac{\mathrm{odds}(Y = 1 \mid X = x + 1, C, U)}{\mathrm{odds}(Y = 1 \mid X = x, C, U)}.$$

This can be seen by noting that this odds ratio measure
can, under a consistency assumption that Y = Y(x) if X = x,
equivalently be written as (Pearl, 1995)

$$\frac{\mathrm{odds}\{Y(x + 1) = 1 \mid C, U\}}{\mathrm{odds}\{Y(x) = 1 \mid C, U\}}, \tag{1}$$

where Y(x) denotes the (possibly) counterfactual
outcome following an intervention setting X at the
exposure level x and where for any V, W,
odds(W = 1|V) = P(W = 1|V)/P(W = 0|V). Effect measure (1)
thus compares the odds of "success" if the exposu- 
re X were uniformly set to x + 1 versus x within stra- 
ta of C and U. Because U is unmeasured, these stra- 
ta are not identified, which makes (1) less appealing 
as an effect measure and of limited use for policy 
making. Its interpretation is especially hindered in 
view of noncollapsibility of the odds ratio (Green- 
land, Robins and Pearl, 1999), following which the 
magnitude of conditional odds ratios changes with 
the conditioning sets, even in the absence of con- 
founding or effect modification. Similar limitations 
are inherent to the so-called treatment effect on the 
treated at the IV level z of exposure x (Tan, 2010), 

$$\frac{\mathrm{odds}\{Y(x) = 1 \mid X(z) = x\}}{\mathrm{odds}\{Y(0) = 1 \mid X(z) = x\}}, \tag{2}$$

and to so-called local or principal stratification causal
odds ratios (Hirano et al., 2000; Frangakis and Rubin, 2002;
Abadie, 2003; Clarke and Windmeijer, 2009;
see Bowden et al., 2010, for a review). For a dichotomous
instrumental variable Z and dichotomous exposure X
taking values 0 and 1, the latter measure
the association between instrumental variable and
outcome within the nonidentifiable principal stratum
of subjects for whom an increase in the instrumental
variable induces an increase in the exposure;
that is,

$$\frac{\mathrm{odds}\{Y(1) = 1 \mid X(1) > X(0), C\}}{\mathrm{odds}\{Y(0) = 1 \mid X(1) > X(0), C\}}. \tag{3}$$

Inference for principal stratification causal odds ra- 
tios is also more rigid in the sense of having no 
flexible extensions to more general settings involv- 
ing continuous instruments and exposures. While di- 
chotomization of the instrument and/or exposure is 
often employed in view of this, it not only implies 
a loss of information, but may also induce a viola- 
tion of the exclusion restriction and may make the 
relevance of the principal stratum "X(1) > X(0)"
become dubious (see Pearl, 2011, for further discus- 
sion of these issues). 
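The noncollapsibility of the odds ratio mentioned above in connection with effect measure (1) can be checked numerically. The following is a minimal sketch under assumed, purely illustrative numbers: a hypothetical dichotomous U with P(U = 1) = 0.5, no confounding, and the same conditional odds ratio of 3 in both strata of U. The function name `marginal_odds_ratio` and all parameter values are assumptions made for this illustration.

```python
import math

def expit(a):
    return 1.0 / (1.0 + math.exp(-a))

def odds(p):
    return p / (1.0 - p)

def marginal_odds_ratio(intercepts, weights, log_or):
    """Odds ratio of Y(1) versus Y(0) after averaging over U, assuming
    logit P{Y(x) = 1 | U = u} = intercepts[u] + log_or * x and P(U = u) = weights[u]."""
    risk1 = sum(w * expit(b + log_or) for b, w in zip(intercepts, weights))
    risk0 = sum(w * expit(b) for b, w in zip(intercepts, weights))
    return odds(risk1) / odds(risk0)

# Illustrative values only: the conditional odds ratio equals 3 in each stratum of U.
conditional_or = 3.0
marginal_or = marginal_odds_ratio([-2.0, 2.0], [0.5, 0.5], math.log(conditional_or))
print(conditional_or, round(marginal_or, 3))
```

In this example the odds ratio averaged over U is closer to 1 than the stratum-specific odds ratio, even though U does not confound the exposure effect and does not modify it on the odds ratio scale.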

In view of the aforementioned limitations, our at- 
tention in this article will focus on causal effects 
which are defined within identifiable subsets of the 
population. Special attention will be given to the 
conditional causal odds ratio (Robins, 2000; Vanstee- 
landt and Goetghebeur, 2003; Robins and Rotnitzky, 
2004), which we define as 

$$\frac{\mathrm{odds}(Y = 1 \mid X, Z, C)}{\mathrm{odds}\{Y(0) = 1 \mid X, Z, C\}}. \tag{4}$$

It expresses the effect of setting the exposure to zero 
within subgroups defined by the observed exposure 
level X, instrumental variables Z and covariates C. 
In the special case where X is a dichotomous treatment
variable, taking the value 1 for treatment and
0 for no treatment, (4) evaluated at X = 1, that is,

$$\frac{\mathrm{odds}\{Y(1) = 1 \mid X = 1, Z, C\}}{\mathrm{odds}\{Y(0) = 1 \mid X = 1, Z, C\}},$$

is sometimes referred to as the treatment effect in 
the treated who are observed to have IV level Z 
(Hernan and Robins, 2006; Robins, VanderWeele and 
Richardson, 2006; Didelez, Meng and Sheehan, 2010; 
Tan, 2010). Conditional causal odds ratios would be 
of special interest if the goal of the study were to 
examine the impact of setting the exposure to zero 
for those with a given exposure level X, for example, 
to examine the impact of preventing nosocomial infection
within those who acquired it (Vansteelandt
et al., 2009). 

While the comparison in (4) could alternatively 
be expressed as a risk difference or relative risk, our 
focus throughout will be limited to odds ratios be- 
cause models for other association measures do not 
guarantee probabilities within the unit interval, and 
might not be applicable under case-control sampling 
(Bowden and Vansteelandt, 2011). We refer the in- 
terested reader to Robins (1994) and Mullahy (1997) 
for inference on the conditional relative risk 

$$\frac{P(Y = 1 \mid X, Z, C)}{P\{Y(0) = 1 \mid X, Z, C\}}, \tag{5}$$

and to van der Laan, Hubbard and Jewell (2007) for 
inference on the so-called switch relative risk, which 
is defined as (5) for subjects with values (X, Z, C)
for which P(Y = 1|X, Z, C) < P{Y(0) = 1|X, Z, C}
and as

$$\frac{P(Y = 0 \mid X, Z, C)}{P\{Y(0) = 0 \mid X, Z, C\}}$$

for all remaining subjects. The latter causal effect 
parameter is more difficult to interpret, but has the 
advantage that models for the switch relative risk, 
unlike models for (5), guarantee probabilities within 
the unit interval. 

For policy making, the interest lies more usually 
in population-averaged or marginal effect measures 
(Greenland, 1987; Stock, 1988) such as 

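$$\frac{\mathrm{odds}\{Y(x + 1) = 1\}}{\mathrm{odds}\{Y(x) = 1\}}, \tag{6}$$

where x is a user-specified reference level, or

$$\frac{\mathrm{odds}\{Y(X + 1) = 1\}}{\mathrm{odds}\{Y(X) = 1\}} \quad \text{or} \quad \frac{\mathrm{odds}\{Y(1.1 \times X) = 1\}}{\mathrm{odds}\{Y(X) = 1\}}. \tag{7}$$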

Here, (6) evaluates the effect of changing the expo- 
sure from level x to x + 1 uniformly in the popula- 
tion. It thus reflects the effect that would have been 
estimated had an ideal randomized controlled trial 
(i.e., with 100% compliance) in fact been possible, 
randomizing subjects over exposure level x versus 
x + 1. In contrast, the effect measures in (7) allow for 
natural variation in the exposure between subjects 
by expressing the effect of an absolute or relative 
increase in the observed exposure. This may ulti- 
mately be of most interest in many observational 
studies, considering that many public health interventions
would target a change in exposure level
(e.g., diet, BMI, physical exercise, . . . ), starting from 
some natural, subject-specific exposure level X. 

We review estimation of the conditional causal 
odds ratio (4) in Section 2. By casting different de- 
velopments within the same causal framework based 
on counterfactuals, new insights into their interconnections
will be developed. We propose novel estimators
of the marginal causal odds ratios given in (6)
and (7) in Section 3, as well as for the corresponding 
effect measures expressed as risk differences or rela- 
tive risks. Extensive simulation studies are reported 
in Section 4 and an evaluation on 3 data sets is given 
in Section 5. 

2. IV ESTIMATION OF THE CONDITIONAL 
CAUSAL ODDS RATIO 

Identification of the conditional causal odds ra- 
tio (4) is studied in detail in Robins and Rotnitzky 
(2004) and Vansteelandt and Goetghebeur (2005), 
who find that — as for other IV estimators (Hernan 
and Robins, 2006) — parametric restrictions are re- 
quired in addition to the standard instrumental vari- 
ables assumptions. In particular, nonlinear exposure 
effects and modification of the exposure effect by 
the instrumental variable are not nonparametrically 
identified. We will therefore consider estimation of 
the conditional causal odds ratio under so-called lo- 
gistic structural mean models (Robins, 2000; Van- 
steelandt and Goetghebeur, 2003; Robins and Rot- 
nitzky, 2004), which impose parametric restrictions 
on the conditional causal odds ratio (4). In partic- 
ular, these models postulate that the exposure ef- 
fect is linear in the exposure on the conditional log 
odds ratio scale, and independent of the instrumen- 
tal variable, in the sense that 



$$\frac{\mathrm{odds}(Y = 1 \mid X, Z, C)}{\mathrm{odds}\{Y(0) = 1 \mid X, Z, C\}} = \exp\{m(C; \psi^*) X\}, \tag{8}$$

where $m(C; \psi)$ is a known function (e.g., $\psi_0 + \psi_1 C$),
smooth (i.e., with continuous first-order derivatives)
in $\psi$, and $\psi^*$ is an unknown finite-dimensional
parameter. In the absence of covariates, this gives rise
to a relatively simple model of the form

$$\frac{\mathrm{odds}(Y = 1 \mid X, Z)}{\mathrm{odds}\{Y(0) = 1 \mid X, Z\}} = \exp(\psi^* X). \tag{9}$$

The assumption that the exposure effect is not modified
by the IV substitutes for the monotonicity assumption
[that X(z) > X(z') if z > z'] (Hernan and Robins,
2006) which is commonly adopted in the principal
stratification approach. In spite of the randomiza- 
tion assumption [cf. IV assumption (c)], it may be 
violated because subjects with exposure level X are 
not exchangeable over levels of the IV, so that they 
might in particular experience different effects. The 
additional assumption of a linear exposure effect is 
only relevant for exposures that take on more than 
two levels. It must be cautiously interpreted because 
the conditional causal odds ratio (4) expresses ef- 
fects for differently exposed subgroups which may 
not be exchangeable. Both these assumptions are 
critical because they are empirically unverifiable. 
Vansteelandt and Goetghebeur (2005) assess the sen- 
sitivity of the conditional causal odds ratio estima- 
tor to violation of the linearity assumption and note 
that, under violation of the linearity assumption, the 
estimator can still yield a meaningful first order ap- 
proximation. In the remainder of this work, we will 
assume that model (8) is correctly specified. 

2.1 Approximate Estimation 

Approximate IV estimators of the conditional cau- 
sal odds ratio can be obtained by averaging over 
the observed exposure values in model (8) using the 
following approximations: 



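$$E\{\operatorname{logit} E(Y \mid X, Z, C) \mid Z, C\} \approx \operatorname{logit} E(Y \mid Z, C), \tag{10}$$

$$E[\operatorname{logit} E\{Y(0) \mid X, Z, C\} \mid Z, C] \approx \operatorname{logit} E\{Y(0) \mid Z, C\}. \tag{11}$$

This together with the logistic structural mean model (8)
implies

$$\begin{aligned}
\operatorname{logit} E(Y \mid Z, C) &\approx \operatorname{logit} E\{Y(0) \mid Z, C\} + m(C; \psi^*) E(X \mid Z, C) \\
&= \operatorname{logit} E\{Y(0) \mid C\} + m(C; \psi^*) E(X \mid Z, C),
\end{aligned} \tag{12}$$

upon noting that the combined IV assumptions (b)
and (c), conditional on C, imply $Y(x) \perp\!\!\!\perp Z \mid C$ for
all x. It follows that approximate IV estimators of
the conditional causal odds ratio can be obtained
via the following two-stage approach: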

1. Estimate the expected exposure as a function of
the IV and covariates by fitting an appropriate
regression model. Let the predicted exposure be
$\hat{X} = \hat{E}(X \mid Z, C)$.

2. Regress the outcome on covariates C and on
$m(C; \psi)\hat{X}$ through standard logistic regression to
obtain an estimate of $\psi^*$. In the absence of
covariates, this involves fitting a logistic regression
model of the form

$$\operatorname{logit} E(Y \mid Z) = \omega + \psi \hat{X}. \tag{13}$$

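When, furthermore, the IV is dichotomous, it follows
from (12) that

$$\mathrm{OR}_{Y|Z} \equiv \frac{\mathrm{odds}(Y = 1 \mid Z = 1)}{\mathrm{odds}(Y = 1 \mid Z = 0)} \approx \exp(\psi^*)^{\Delta_{X|Z}}, \tag{14}$$

where $\Delta_{X|Z} = E(X \mid Z = 1) - E(X \mid Z = 0)$, so
that $\psi^*$ can be estimated as $\log \mathrm{OR}_{Y|Z} / \Delta_{X|Z}$.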
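For a dichotomous IV without covariates, the ratio of the log odds ratio between Y and Z to the mean exposure difference between IV levels can be computed directly from the data. The sketch below is only an illustration: the function name `wald_log_odds_ratio` and the toy data are hypothetical, and the group means stand in for the first-stage regression.

```python
import math

def wald_log_odds_ratio(z, x, y):
    """Approximate estimate of psi*: log odds ratio between Y and Z divided by
    the difference in mean exposure between the two IV levels (dichotomous Z)."""
    def mean(values):
        return sum(values) / len(values)

    x1 = [xi for zi, xi in zip(z, x) if zi == 1]
    x0 = [xi for zi, xi in zip(z, x) if zi == 0]
    y1 = [yi for zi, yi in zip(z, y) if zi == 1]
    y0 = [yi for zi, yi in zip(z, y) if zi == 0]

    p1, p0 = mean(y1), mean(y0)
    log_or_yz = math.log(p1 / (1.0 - p1)) - math.log(p0 / (1.0 - p0))
    delta_xz = mean(x1) - mean(x0)  # first-stage difference in expected exposure
    return log_or_yz / delta_xz

# Hypothetical toy data: 10 subjects per IV level.
z = [0] * 10 + [1] * 10
x = [1] * 2 + [0] * 8 + [1] * 7 + [0] * 3
y = [1] * 4 + [0] * 6 + [1] * 6 + [0] * 4
psi_wald = wald_log_odds_ratio(z, x, y)
print(psi_wald, math.exp(psi_wald))
```

The estimate is undefined when the exposure means do not differ between IV levels, and it becomes unstable when that difference is small.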

The estimator obtained using the above two-stage 
approach is referred to as the standard IV estima- 
tor in Palmer et al. (2008), a Wald-type estimator in 
Didelez, Meng and Sheehan (2010) and the 2-stage 
logistic approach in Rassen et al. (2009). It is com- 
monly employed in the analysis of Mendelian ran- 
domization studies (Thompson et al., 2003; Palmer 
et al., 2008), where it is typically viewed as an ap- 
proximate estimator of the conditional causal odds 
ratio (1). Our alternative development shows that 
it can also be viewed as an approximate estimator 
of the conditional causal odds ratio (4). To gain insight
into the adequacy of the approximations (10)
and (11), suppose for simplicity that there are no
covariates, that the exposure has a normal distribution
with constant variance $\sigma_x^2$ conditional on Z, that
$\operatorname{logit} E(Y \mid X, Z) = \beta_0 + \beta_x X + \beta_z Z$ and that
$m(C; \psi) = \psi$. Then it is easily shown, using results in
Zeger and Liang (1988), that

$$\operatorname{logit} E(Y \mid Z) \approx \beta_0\{\beta_x^2 \sigma_x^2\} + \beta_x\{\beta_x^2 \sigma_x^2\} E(X \mid Z) + \beta_z\{\beta_x^2 \sigma_x^2\} Z,$$

$$\operatorname{logit} E\{Y(0) \mid Z\} \approx \beta_0\{(\beta_x - \psi^*)^2 \sigma_x^2\} + (\beta_x - \psi^*)\{(\beta_x - \psi^*)^2 \sigma_x^2\} E(X \mid Z) + \beta_z\{(\beta_x - \psi^*)^2 \sigma_x^2\} Z,$$

where for any parameter $\beta$ and variance component
$\sigma^2$, we define $\beta\{\sigma^2\} = \beta (c^2 \sigma^2 + 1)^{-1/2}$ with
$c = 16\sqrt{3}/(15\pi)$. It can relatively easily be deduced from
these expressions and the fact that E{Y(0)|Z} =
E{Y(0)} that

$$\operatorname{logit} E(Y \mid Z) \approx \beta_0' + \frac{\psi^*}{\sqrt{1 + c^2 \beta_x^2 \sigma_x^2}} E(X \mid Z),$$

for some $\beta_0'$, suggesting increasing bias with increasing
association between X and Y (given Z) and
with increasing residual variance in X (given Z).
This is true except at the null hypothesis of no
causal effect because $Y \perp\!\!\!\perp Z$ at the null hypothesis
so that the usual maximum likelihood estimator
of $\psi$ in model (13) will then converge to 0 in probability.
Further, note that the standard IV estimator
requires correct specification of the first stage re- 
gression model for the expected exposure (Didelez, 
Meng and Sheehan, 2010; Rassen et al., 2009; Hen- 
neman, van der Laan and Hubbard, 2002). In spite 
of its approximate nature, the standard IV estimator 
continues to be much used in Mendelian randomiza- 
tion studies because of its simplicity, because it can 
be used in meta-analyses of summary statistics, even 
when information on $\mathrm{OR}_{Y|Z}$ and $\Delta_{X|Z}$ is obtained
from different studies (Minelli et al., 2004; Smith 
et al., 2005; Bowden et al., 2006), and because the 
underlying principle extends to case-control studies 
when the first stage regression is evaluated on the 
controls and the disease prevalence is low (Smith 
et al., 2005; Bowden and Vansteelandt, 2011). For 
relative risk estimators, the resulting bias due to 
basing the first stage regression on controls rather 
than a random population sample amounts to the 
difference between the log relative risk and the log 
odds ratio between Y and Z, inflated by the reciprocal
of the exposure distortion $\Delta_{X|Z}$ (Bowden and
Vansteelandt, 2011). 
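The size of the attenuation implied by averaging a logistic model over a normally distributed exposure can be inspected numerically. The following sketch compares the attenuated-coefficient approximation, with attenuation factor (c^2 v + 1)^(-1/2) and c = 16*sqrt(3)/(15*pi), against direct numerical integration. The function names, the midpoint integration rule and the parameter values are assumptions made for illustration only.

```python
import math

C = 16.0 * math.sqrt(3.0) / (15.0 * math.pi)

def expit(a):
    return 1.0 / (1.0 + math.exp(-a))

def attenuate(beta, v):
    """Attenuated coefficient beta * (c^2 * v + 1)^(-1/2) for variance component v."""
    return beta / math.sqrt(C * C * v + 1.0)

def marginal_risk(eta, slope, sigma, n=4000):
    """Midpoint-rule integral of expit(eta + slope * x) against a N(0, sigma^2)
    density, where eta is the linear predictor at the mean exposure."""
    lower, upper = -8.0 * sigma, 8.0 * sigma
    h = (upper - lower) / n
    total = 0.0
    for k in range(n):
        xk = lower + (k + 0.5) * h
        density = math.exp(-0.5 * (xk / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += expit(eta + slope * xk) * density * h
    return total

# Illustrative values: linear predictor 1 at the mean exposure, slope 1, residual SD 1.
eta, slope, sigma = 1.0, 1.0, 1.0
exact = marginal_risk(eta, slope, sigma)
approx = expit(attenuate(eta, slope ** 2 * sigma ** 2))
print(round(exact, 4), round(approx, 4))
```

With a residual variance of zero the attenuation factor equals 1, and it shrinks toward zero as the product of the squared slope and the residual variance grows.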

The bias of the standard IV estimator can sometimes
be attenuated by including the first-stage residual
$\hat{R} = X - \hat{X}$ as an additional regressor to $\hat{X}$ in
model (13). This is known as the control functions
approach in econometrics (Smith and Blundell, 1986;
Rivers and Vuong, 1988) and has also been conside- 
red in the biostatistical literature on noncompliance 
adjustment (Nagelkerke et al., 2000) and Mendelian 
randomization (Palmer et al., 2008). A control func- 
tion refers to a random variable conditioning on 
which renders the exposure independent of the un- 
measured variables that confound the association 
between exposure and outcome. Intuitively, the re- 
gression residual R may apply as a control function 
because it captures (part of) those confounders. In 
particular, let us summarize (without loss of gen- 
erality) all confounders of the exposure effect into 
a scalar measurement U. Assume that the contri- 
butions of the instrument Z and confounder U are 
additive in the sense that X = h(Z) + U for some
function h. Suppose for simplicity that there are no
covariates and that the conditional mean E(X|Z) is
known so that $\hat{X} = h(Z)$ (here we use that $U \perp\!\!\!\perp Z$,
as implied by the IV assumptions). Then $\hat{R} = U$ so
that a (correctly specified) logistic regression of Y
on $\hat{X}$ and $\hat{R}$ (or, equivalently, X and $\hat{R}$) will yield
a consistent estimator of the conditional causal odds
ratio (1), which is here identical to (4) because U is
completely determined by X and Z. More generally,
following the lines of Smith and Blundell (1986),
assume that X = h(Z) + V, $U = \beta_1^* V + \varepsilon$, where $\varepsilon$
follows a standard logistic distribution, and that
Y(x) = 1 if and only if $\beta_0^* + \psi^* x + U > 0$ for some
$\beta_0^*$, $\beta_1^*$. Then it also follows that Y = 1 if and only if
$\varepsilon > -\beta_0^* - \psi^* X - \beta_1^* V$, from which

$$\operatorname{logit} E(Y \mid X, V) = \beta_0^* + \psi^* X + \beta_1^* V.$$

Upon substituting V with the estimated regression
residual $\hat{R}$, one obtains an estimator $\exp(\hat{\psi})$ which
consistently estimates the conditional causal odds
ratio (1). In the Appendix we demonstrate that this
is also a consistent estimator of the conditional causal 
odds ratio (4) when the exposure is normally dis- 
tributed with constant variance, conditional on the 
instrument, but not necessarily otherwise. Standard 
error calculation for the standard and adjusted IV 
estimators is also detailed in the Appendix. 

Over recent years, semiparametric analogs to the 
adjusted IV approach have been developed in the 
econometrics literature to alleviate concerns about 
model misspecification. Blundell and Powell (2004) 
and Rothe (2009), for instance, avoid parametric restrictions
on the conditional expectations E(X|Z, C)
and E(Y|X, Z, C) (and, in particular, on the distribution
of $\varepsilon$) by using kernel regression estimators
and semiparametric maximum likelihood estimation,
respectively. Imbens and Newey (2009) allow
for the contributions of the instrument Z and
confounder U on the exposure to be nonadditive by
extending the previous works to nonseparable exposure
models of the form X = h(Z, U, C) for some
function h. They show that the association between
exposure and outcome is unconfounded upon adjusting
for $R = F_{X|Z,C}(X \mid Z, C)$ as a control function,
where $F_{X|Z,C}$ is the conditional cumulative distribution
function of X, given Z and C. To avoid parametric
restrictions on the conditional expectations
$F_{X|Z,C}(x \mid Z, C)$ and E(Y|X, Z, C), they base inference
on local linear regression estimators.

A limitation of all these semiparametric approaches 
is that, by avoiding assumptions on the distribution
of $\varepsilon$, the causal parameter $\psi^*$ becomes difficult to
interpret so that it may be exclusively of interest 
for the calculation of marginal causal odds ratios 
(see Section 3). A further limitation is that all foregoing
approaches require the exposure to be continuously
distributed (Rothe, 2009); some additionally
require the IV to be continuously distributed
(Imbens and Newey, 2009). In the next section we
review direct approaches to the estimation of the 
conditional causal odds ratio (4) which do not rely 
on assumptions about the exposure distribution. 

2.2 Consistent Estimation 

Remember that, although Y may well depend on Z
(in the presence of an exposure effect), the IV assumptions
imply that $Y(0) \perp\!\!\!\perp Z \mid C$. Vansteelandt
and Goetghebeur (2003) make use of this to obtain
a consistent estimator of $\psi^*$ in model (8), which is
chosen to make this independence happen. Because
this is not possible without making additional parametric
modeling assumptions (Robins and Rotnitzky,
2004), they model the expected observed outcome,
conditional on the exposure and IV, for example,

$$\operatorname{logit} P(Y = 1 \mid X, Z, C) = \beta_0^* + \beta_1^* X + \beta_2^* Z + \beta_3^* XZ + \beta_4^* C, \tag{15}$$

where $\beta_0^*$, $\beta_1^*$, $\beta_2^*$, $\beta_3^*$ and $\beta_4^*$ are unknown scalar
parameters. More generally, one may postulate that

$$\operatorname{logit} E(Y \mid X, Z, C) = m(X, Z, C; \beta^*), \tag{16}$$

where $m(X, Z, C; \beta)$ is a known function, smooth
in $\beta$, and $\beta^*$ is an unknown finite-dimensional
parameter. An estimator $\hat{\beta}$ of $\beta^*$ can be obtained using
standard methods (e.g., using maximum likelihood
estimation). Combining the causal model (8) with
the so-called association model (16) yields a prediction
for the counterfactual outcome Y(0) for each
subject which, for given $\psi$, equals

$$H(\psi, \hat{\beta}) = \operatorname{expit}\{m(X, Z, C; \hat{\beta}) - m(C; \psi) X\},$$

where expit(a) = exp(a)/{1 + exp(a)}. Because
E{Y(0)|Z, C} = E{Y(0)|C} under the IV assumptions,
the value of $\psi^*$ can now be chosen as the
value $\hat{\psi}$ which makes this mean independence happen,
once Y(0) is replaced by $H(\psi, \hat{\beta})$. When there
are no covariates and the instrument Z is dichotomous,
taking the values 0 and 1, one thus chooses $\psi$
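such that

$$\frac{\sum_{i=1}^n Z_i H_i(\psi, \hat{\beta})}{\sum_{i=1}^n Z_i} = \frac{\sum_{i=1}^n (1 - Z_i) H_i(\psi, \hat{\beta})}{\sum_{i=1}^n (1 - Z_i)}. \tag{17}$$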

When also the exposure is dichotomous, then model (15)
is guaranteed to hold and a closed-form estimator
is obtained, as given in the Appendix. In
most cases, the solution to (17) gives a unique estimator
of the causal odds ratio, although multiple or
no solutions are sometimes obtained when precision
is limited due to small sample size or the outcome
mean being close to 0 or 1. This is illustrated in Figure 1,
which displays the left- and right-hand side
of (17) as a function of $\psi$ for 3 settings. The top 2
panels are based on the same simulated data set.
They show that 2 or no solutions can be obtained
for the same data set, depending on whether the
association model (16) includes an interaction between
exposure and instrument (left panel) or not
(right panel). The bottom panel corresponds to the
data analysis of Section 5.1, where a single solution
was obtained. Our experience indicates that, when
2 solutions are obtained, one of them corresponds
to an effect size which is so large that it would be
deemed unrealistic [and correspondingly yield unrealistically
small or large values of E{Y(0)}]. When
no solutions are obtained, this can sometimes be resolved
by choosing a less parsimonious association
model (as in Figure 1, top), but must be seen as an
indication that information is very limited. In the
simulation experiments of Section 4, a single solution
was always obtained, but convergence of the
root-finding algorithm (nlm in R) was sometimes
very dependent on the choice of an adequate starting
value.

Fig. 1. Plot of the left- (solid) and right-hand side (dotted) of expression (17) as a function of $\psi$. Top: simulated data set
[Right: with $\beta_3^* = 0$ in model (15)]; Bottom: data set analyzed in Section 5.1.
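For a dichotomous instrument and a dichotomous exposure without covariates, the search for the value of the causal parameter that equalizes the mean predicted exposure-free outcome across IV levels can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the association model is taken to be saturated, so that fitted risks are cell proportions; the root is located by bisection on a user-chosen bracket; and the function names and toy data are hypothetical.

```python
import math

def expit(a):
    return 1.0 / (1.0 + math.exp(-a))

def logit(p):
    return math.log(p / (1.0 - p))

def mean_h_difference(psi, x, z, y):
    """Mean predicted Y(0) among Z = 1 minus that among Z = 0, for given psi.
    Requires outcome proportions strictly between 0 and 1 in exposed cells."""
    cells = {}
    for xi, zi, yi in zip(x, z, y):
        n, s = cells.get((xi, zi), (0, 0))
        cells[(xi, zi)] = (n + 1, s + yi)
    sums = {0: 0.0, 1: 0.0}
    counts = {0: 0, 1: 0}
    for (xi, zi), (n, s) in cells.items():
        p = s / n  # saturated association model: cell proportion
        h = p if xi == 0 else expit(logit(p) - psi)  # remove the exposure effect
        sums[zi] += n * h
        counts[zi] += n
    return sums[1] / counts[1] - sums[0] / counts[0]

def solve_psi(x, z, y, lower=-5.0, upper=5.0, tol=1e-10):
    """Bisection for a root of mean_h_difference on [lower, upper]."""
    f_lower = mean_h_difference(lower, x, z, y)
    f_upper = mean_h_difference(upper, x, z, y)
    if f_lower * f_upper > 0:
        raise ValueError("no sign change on the bracket: zero or several roots possible")
    while upper - lower > tol:
        mid = 0.5 * (lower + upper)
        f_mid = mean_h_difference(mid, x, z, y)
        if f_lower * f_mid <= 0:
            upper = mid
        else:
            lower, f_lower = mid, f_mid
    return 0.5 * (lower + upper)

# Hypothetical data. Z = 0: 100 unexposed (30 events);
# Z = 1: 50 unexposed (10 events) and 50 exposed (25 events).
x = [0] * 100 + [0] * 50 + [1] * 50
z = [0] * 100 + [1] * 100
y = [1] * 30 + [0] * 70 + [1] * 10 + [0] * 40 + [1] * 25 + [0] * 25
psi_hat = solve_psi(x, z, y)
print(psi_hat, math.exp(psi_hat))
```

Bisection only finds one root inside the bracket, so plotting the difference over a grid of values of the causal parameter remains advisable when several or no solutions are suspected.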



For general instruments, a consistent point estimator
of $\psi^*$ can be found by solving the unbiased
estimating equation

$$0 = \sum_{i=1}^n [d(Z_i, C_i) - E\{d(Z_i, C_i) \mid C_i\}] \cdot [H_i(\psi, \hat{\beta}) - E\{H_i(\psi, \hat{\beta}) \mid C_i\}] \tag{18}$$

for $\psi$, where $d(Z_i, C_i)$ is an arbitrary function of $Z_i$
and $C_i$, for example, $d(Z_i, C_i) = Z_i$ (see Bowden and
Vansteelandt, 2011, for choices that yield a semiparametric
efficient estimator of $\psi^*$). This thus leads
to the following 2-stage approach:

1. First fit the association model (16), for instance,
using maximum likelihood estimation, and obtain
an estimator $\hat{\beta}$ of $\beta^*$;

2. Next, solve equation (18) to obtain an estimator
$\hat{\psi}$ of $\psi^*$.
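The second stage can be sketched for the simplest case of no covariates, with the instrument itself as the weighting function and a scalar causal parameter, so that the conditional expectations given covariates reduce to sample means. The fitted linear predictors from the first stage are taken as given (any logistic regression routine could supply them); the function name `estimating_function` and the toy inputs are hypothetical.

```python
import math

def expit(a):
    return 1.0 / (1.0 + math.exp(-a))

def logit(p):
    return math.log(p / (1.0 - p))

def estimating_function(psi, x, z, fitted_logit):
    """Sum over subjects of (Z_i - mean Z) * (H_i(psi) - mean H(psi)), where
    H_i(psi) = expit(fitted_logit_i - psi * X_i) predicts the exposure-free outcome."""
    n = len(z)
    h = [expit(li - psi * xi) for li, xi in zip(fitted_logit, x)]
    z_bar = sum(z) / n
    h_bar = sum(h) / n
    return sum((zi - z_bar) * (hi - h_bar) for zi, hi in zip(z, h))

# Hypothetical first-stage output: fitted risks 0.3 (Z=0, X=0), 0.2 (Z=1, X=0), 0.5 (Z=1, X=1).
x = [0] * 150 + [1] * 50
z = [0] * 100 + [1] * 100
fitted_logit = [logit(0.3)] * 100 + [logit(0.2)] * 50 + [logit(0.5)] * 50

for psi in (0.0, math.log(1.5), 1.0):
    print(round(psi, 3), round(estimating_function(psi, x, z, fitted_logit), 6))
```

The estimate is the value of the causal parameter at which the printed function crosses zero; a generic one-dimensional root finder can be applied to it, bearing in mind the caveats about multiple or missing solutions.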

Corresponding R-code is available from the first author's
website (users.ugent.be/~svsteela). This approach
is extended in Tan (2010) to enable estimation
of the treatment effect on the treated at the IV
level z of exposure x, as defined in (2), thus avoiding
conditioning on C.

In the Appendix we show that when the association
model includes an additive term in
$d(Z_i, C_i) - E\{d(Z_i, C_i) \mid C_i\}$ and is fitted using maximum
likelihood estimation as in standard generalized linear
model software, then the solution to (18) is robust to
misspecification of the association model (16) when $\psi^* = 0$.
This means that a consistent estimator of $\psi^* = 0$
is obtained, even when all models are misspecified.
In the absence of covariates and with $d(Z_i, C_i) = Z_i$
and $E\{d(Z_i, C_i) \mid C_i\} = \sum_{j=1}^n Z_j / n$, this is satisfied
as soon as the association model includes an intercept
and main effect in $Z_i$ [as in model (15)]. The
proposed approach then yields a valid (Wald and
score) test of the causal null hypothesis that $\psi^* = 0$,
even when both models (8) and (16) are misspecified.
This property, which we refer to as a "local" robustness
property (Vansteelandt and Goetghebeur,
2003), also guarantees that estimators of the causal
odds ratio will have small bias under model misspecification
when the true exposure effect is close
to, but not equal to, zero.

A drawback of the parameterization by Vanstee- 
landt and Goetghebeur (2003) is that the association 
model may be uncongenial with the causal model. 
Specifically, given the observed data law f(X, Z|C)
and the limiting value $\beta^*$ of $\hat{\beta}$, there may be no value
of the causal parameter $\psi$ for which
$E\{H(\psi, \beta^*) \mid Z, C\} = E\{H(\psi, \beta^*) \mid C\}$ over the entire support of Z
and C. In the Appendix, we show that this may
happen when parametric restrictions are imposed 
on the main effect of the instrumental variable in 
the association model (16), along with its interac- 
tion with covariates C, but not when that main ef- 
fect is left unrestricted. It follows that no congenial- 
ity problems arise in the common situation of a di- 
chotomous instrument and no covariates, so long as 
a main effect of the IV is included in the associa- 
tion model. This continues to be true for categorical 
IVs with more than 2 levels when dummy regressors 
are used for the instrument in the association model 
and there are no covariates. For general IVs, one 
may consider generalized additive association mod- 
els which leave the main effect of the IV unrestricted 
(apart from smoothness restrictions). 

Robins and Rotnitzky (2004) developed an alternative approach for estimation of $\psi^*$ in model (8),
which guarantees a congenial parameterization by
avoiding direct specification of an association model.
They parameterize instead the selection-bias function

(19)  $\operatorname{logit} E\{Y(0) \mid X, Z, C\} - \operatorname{logit} E\{Y(0) \mid X = 0, Z, C\} = q(X, Z, C; \eta^*)$,

where $q(X, Z, C; \eta)$ is a known function satisfying
$q(0, Z, C; \eta) = 0$, smooth in $\eta$, and $\eta^*$ is an unknown
finite-dimensional parameter. That $q(X, Z, C; \eta^*)$ encodes
the degree of selection bias can be seen because
$q(X, Z, C; \eta^*) = 0$ for all $X$ implies that
$E\{Y(0) \mid X, Z, C\} = E\{Y(0) \mid Z, C\}$ and thus implies that the
association between exposure and outcome [more
precisely, $Y(0)$] is unconfounded (conditional on $Z$
and $C$). Relying on a parametric model for the conditional
exposure distribution, $f(X \mid Z, C) = f(X \mid Z, C; \alpha^*)$
(fitted using maximum likelihood inference,
for instance), their approach involves the following
iterative procedure. First, for each fixed $\psi$ (starting
from an initial value $\psi_0$), maximum likelihood
estimators $\hat\eta(\psi)$ and $\hat\omega(\psi)$ are computed for the
parameters $\eta^*$ and $\omega^*$ indexing the implied association
model

$P(Y = 1 \mid X, Z, C; \psi, \eta^*, \omega^*)$

(20)  $= \operatorname{expit}\{m(C; \psi) X + q(X, Z, C; \eta^*) + v(Z, C; \eta^*, \omega^*)\}$,

where $v(Z, C; \eta^*, \omega^*) = \operatorname{logit} E\{Y(0) \mid X = 0, Z, C\}$ is
the solution to the integral equation

$\operatorname{logit} E\{Y(0) \mid C\} = t(C; \omega^*)$

(21)  $= \operatorname{logit} \int \operatorname{expit}\{q(X = x, Z, C; \eta^*) + v(Z, C; \eta^*, \omega^*)\}\, f(X = x \mid Z, C; \alpha^*)\, dx$,

where $t(C; \omega)$ is a known function of $C$, smooth in $\omega$,
and where $\omega^*$ is an unknown finite-dimensional parameter.
For the given estimators $\hat\eta(\psi)$ and $\hat\omega(\psi)$, an
estimator of $\psi$ is then obtained by solving a linear
combination of the estimating equations (18) and estimating
equations for the parameters indexing the
association model (20). Both these steps are then iterated
until convergence of the estimator. In the Appendix
we suggest a somewhat simpler strategy which,
nonetheless, also involves solving integral equations.
Alternatively, one could focus on the switch relative
risk of van der Laan, Hubbard and Jewell (2007),
introduced in Section 1, to avoid the uncongeniality
problems associated with the odds ratio.
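To make the computational step concrete, the following minimal Python sketch solves an equation of the form (21) for $v$ at a single $(Z, C)$ value, reading (21) as $E\{Y(0) \mid C\} = \int \operatorname{expit}\{q + v\} f\, dx$ and assuming a discrete exposure so that the integral reduces to a sum. The function name `solve_v`, the bisection bounds and the example numbers are illustrative assumptions, not part of the published procedure.

```python
import math

def expit(u):
    return 1.0 / (1.0 + math.exp(-u))

def solve_v(t, q_values, f_values, lo=-30.0, hi=30.0, tol=1e-12):
    """Solve expit(t) = sum_x expit(q(x) + v) * f(x) for v by bisection.

    t        : value of logit E{Y(0)|C} at the covariate level of interest
    q_values : selection-bias function q(x, Z, C; eta) at each exposure level x
    f_values : exposure probabilities f(x | Z, C), summing to one
    """
    target = expit(t)

    def mean_y0(v):
        return sum(f * expit(q + v) for q, f in zip(q_values, f_values))

    # mean_y0 is strictly increasing in v, so the root is unique.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_y0(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical inputs: binary exposure, q(0) = 0 as the model requires.
v = solve_v(t=-1.0, q_values=[0.0, 0.5], f_values=[0.4, 0.6])
```

In a full implementation this solve would be repeated for every observed $(Z, C)$ combination inside each iteration over $\psi$, which is the source of the computational cost discussed in the text.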
An advantage of the approach of Robins and Rotnitzky (2004)
is that it guarantees that $E\{Y(0) \mid Z, C\} = E\{Y(0) \mid C\}$
for all $Z$ and $C$, although only under
correct specification of the law $f(X \mid Z, C)$. Under
the approach of Vansteelandt and Goetghebeur
(2003), this is only guaranteed under congenial
parameterizations as suggested previously, but regardless
of whether a model for the law $f(X \mid Z, C)$ is
(correctly) specified. A further advantage of the Robins and
Rotnitzky approach is that it may give somewhat more efficient
estimators by fully exploiting the a priori knowledge that
$E\{Y(0) \mid Z, C\} = E\{Y(0) \mid C\}$ to estimate unknown
parameters [i.e., $v(Z, C)$] and by additionally relying
on a model for the exposure distribution. A drawback
is that the approach is computationally demanding,
especially for continuous IVs and/or in the
presence of covariates, as it involves solving integral
equations for each $(Z, C)$, and this within each iteration
of the algorithm. In addition, standard error
calculations are more complex. A further drawback
is that consistent estimation (away from the null)
requires correct specification of the conditional exposure
distribution $f(X \mid Z, C)$.

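The estimation procedure for logistic structural
mean models simplifies when the logit link is replaced
with the probit link and the exposure is assumed
to be normally distributed conditional on
the instrumental variable and covariates (with mean
$\alpha_0^* + \alpha_1^* Z + \alpha_2^* C$ and constant standard deviation $\sigma^*$,
where $\alpha_0^*, \alpha_1^*, \alpha_2^*, \sigma^*$ are unknown). For instance, combining
the probit structural mean model

(22)  $\Phi^{-1}\{E(Y \mid X, Z, C)\} - \Phi^{-1}[E\{Y(0) \mid X, Z, C\}] = \phi^* X$,

where $\Phi^{-1}$ is the probit link and $\phi^*$ is unknown,
with the probit association model

(23)  $\Phi^{-1}\{E(Y \mid X, Z, C)\} = \theta_0^* + \theta_1^* X + \theta_2^* Z + \theta_3^* C$,

where $\theta_0^*, \theta_1^*, \theta_2^*, \theta_3^*$ are unknown, and averaging over the
exposure, conditional on $Z$ and $C$ (see the Appendix),
gives

(24)  $E\{Y(0) \mid Z, C\} = \Phi\bigl[\{\theta_0^* + \theta_2^* Z + (\theta_1^* - \phi^*)(\alpha_0^* + \alpha_1^* Z + \alpha_2^* C) + \theta_3^* C\}\{1 + (\theta_1^* - \phi^*)^2 \sigma^{*2}\}^{-1/2}\bigr]$.

Because this does not depend on $Z$ under the IV assumptions,
it follows that $\theta_2^* = (\phi^* - \theta_1^*)\alpha_1^*$. Averaging
over the exposure in the association model (23)
and using the previous identity, we obtain

$\Phi^{-1}\{E(Y \mid Z, C)\} = \{\theta_0^* + \theta_1^* \alpha_0^* + \phi^* \alpha_1^* Z + (\theta_1^* \alpha_2^* + \theta_3^*) C\}(1 + \theta_1^{*2} \sigma^{*2})^{-1/2}$.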

This suggests regressing the outcome on the instrumental
variable and covariate using the probit regression model

(25)  $\Phi^{-1}\{E(Y \mid Z, C)\} = \lambda_0^* + \lambda_1^* Z + \lambda_2^* C$

to obtain an estimate $\hat\lambda_1$ for the unknown regression
slope $\lambda_1^*$, and then estimating $\phi^*$ as

(26)  $\hat\phi = \hat\lambda_1 \sqrt{1 + \hat\theta_1^2 \hat\sigma^2} \,/\, \hat\alpha_1$.

We will refer to this estimator as the "Probit-Normal
SMM estimator" throughout. It is related to the instrumental
variables probit (Lee, 1981) and the generalized
two-stage simultaneous probit (Amemiya,
1978), both of which instead infer effect estimates
conditional on the unmeasured confounder $U$. When
the outcome mean lies between 10% and 90%, the
above estimator yields an approximate estimate of
the causal odds ratio through the identity
$\exp(\psi^*) \approx \exp(\phi^*/0.6071)$ (McCullagh and Nelder, 1989). For
dichotomous exposures, related estimators can be
obtained via probit structural equation models that
replace the linear regression model for $X_i$ in assumption 1
above, with a probit regression model (see,
e.g., Rassen et al., 2009).
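The averaging step behind (24) rests on the identity $E\{\Phi(a + bX)\} = \Phi\{(a + b\mu)/\sqrt{1 + b^2\sigma^2}\}$ for $X \sim N(\mu, \sigma^2)$. The Python sketch below checks this identity numerically and then evaluates the Probit-Normal SMM estimator, assuming (26) has the form $\hat\lambda_1 \sqrt{1 + \hat\theta_1^2 \hat\sigma^2}/\hat\alpha_1$ implied by the derivation. All function names and input values are hypothetical.

```python
import math

def norm_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def norm_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def averaged_probit(a, b, mu, sigma, n=20001, width=10.0):
    """Trapezoid-rule average of Phi(a + b*X) over X ~ N(mu, sigma^2)."""
    lo = mu - width * sigma
    h = 2.0 * width * sigma / (n - 1)
    total = 0.0
    for k in range(n):
        x = lo + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * norm_cdf(a + b * x) * norm_pdf((x - mu) / sigma) / sigma
    return total * h

def probit_normal_smm(lambda1, theta1, sigma, alpha1):
    """Estimate of phi from the reduced-form slope lambda1 (assumed form of (26))."""
    return lambda1 * math.sqrt(1.0 + theta1 ** 2 * sigma ** 2) / alpha1

# Hypothetical parameter values, used only to exercise the formulas.
a, b, mu, sigma = -0.3, 0.8, 1.0, 1.5
closed_form = norm_cdf((a + b * mu) / math.sqrt(1.0 + b * b * sigma * sigma))
numeric = averaged_probit(a, b, mu, sigma)

phi_hat = probit_normal_smm(lambda1=0.2, theta1=0.8, sigma=1.5, alpha1=0.5)
approx_log_odds_ratio = phi_hat / 0.6071  # probit-to-logit rescaling from the text
```

The rescaling in the last line is only the rough approximation described above for outcome means between 10% and 90%.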

3. IV ESTIMATION OF THE MARGINAL 
CAUSAL ODDS RATIO 

We will now turn attention to the identification of
marginal causal effects. Under linear structural models,
these coincide with conditional causal effects under
typical assumptions (Hernán and Robins, 2006).
Consider, for instance, the extended linear structural
mean model which imposes the restriction

$E\{Y - Y(x) \mid X, C, Z\} = m(C, x; \psi^*)(X - x)$

for each feasible exposure level $x$, where $m(C, x; \psi)$
is a known function (e.g., $\psi_0 + \psi_1 C + \psi_2 x$), smooth
in $\psi$, and $\psi^*$ an unknown finite-dimensional parameter.
Then it follows from the restriction

$E\{Y - m(C, x; \psi^*)(X - x) \mid C, Z\} = E\{Y - m(C, x; \psi^*)(X - x) \mid C\}$

for each $x$, that

$E\{Y - m(C, x; \psi^*) X \mid C, Z\} = E\{Y - m(C, x; \psi^*) X \mid C\}$

for each $x$, and thus that $m(C, x; \psi^*)$ does not depend
on $x$. This then implies that the marginal causal
effect equals

$E\{Y(x^*) - Y(x) \mid C\} = m(C, 0; \psi^*)(x^* - x)$.

Unfortunately, this result does not extend to logistic
structural mean models, so that the conditional
causal odds ratio corresponding to a single reference
exposure level (e.g., 0) does not uniquely map into
the marginal causal odds ratio.

Let us therefore assume that in addition to the
association model (16), the extended logistic structural
mean model holds, which we define by the restriction

(27)  $\dfrac{\operatorname{odds}(Y = 1 \mid X, Z, C)}{\operatorname{odds}\{Y(x) = 1 \mid X, Z, C\}} = \exp\{m(C; \psi_x^*)(X - x)\}$,

for each feasible exposure level $x$, where $m(C; \psi_x)$ is
a known function (e.g., $\psi_{x0} + \psi_{x1} C$), smooth in $\psi_x$,
and $\psi_x^*$ an unknown finite-dimensional parameter.
The marginal causal odds ratio (6) can now be identified
upon noting that

$P\{Y(x) = 1\} = E[\operatorname{expit}\{m(X, Z, C; \beta^*) - m(C; \psi_x^*)(X - x)\}]$

and the marginal causal odds ratio [(7), left] upon
noting that

$P\{Y(X + 1) = 1\} = E[\operatorname{expit}\{m(X, Z, C; \beta^*) + m(C; \psi_{X+1}^*)\}]$.

A consistent estimator of (6) is thus obtained by first
obtaining consistent estimators of $\beta^*$, $\psi_x^*$ and $\psi_{x+1}^*$,
using the strategy of the previous section, and then
calculating $\hat p_{x+1}(1 - \hat p_x)/\{\hat p_x(1 - \hat p_{x+1})\}$, where for
given $x$

$\hat p_x = n^{-1} \sum_{i=1}^{n} \operatorname{expit}\{m(X_i, Z_i, C_i; \hat\beta) - m(C_i; \hat\psi_x)(X_i - x)\}$.

A consistent estimator of [(7), left] is obtained by
first obtaining consistent estimators of $\beta^*$ and $\psi_{x+1}^*$
for each observed value for $x$ using the strategy of
the previous section, and then calculating
$\hat p_{X+1}(1 - \hat p_X)/\{\hat p_X(1 - \hat p_{X+1})\}$, where

$\hat p_X = n^{-1} \sum_{i=1}^{n} Y_i$,

$\hat p_{X+1} = n^{-1} \sum_{i=1}^{n} \operatorname{expit}\{m(X_i, Z_i, C_i; \hat\beta) + m(C_i; \hat\psi_{X_i+1})\}$.

Standard error calculations are reported in the Appendix.
Using the above expressions, estimators of the marginal
risk difference $P\{Y(x + 1) = 1\} - P\{Y(x) = 1\}$ or relative
risk $P\{Y(x + 1) = 1\}/P\{Y(x) = 1\}$ can also be obtained straightforwardly.
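The plug-in calculation for the marginal causal odds ratio (6) can be sketched as follows. This is a minimal illustration under simplifying assumptions that are not part of the text: no covariates, a scalar causal parameter that is the same for every reference level (the "approximate" variant), and a fitted association model supplied as a plain function. The names `marginal_log_odds_ratio` and `assoc` are hypothetical.

```python
import math

def expit(u):
    return 1.0 / (1.0 + math.exp(-u))

def marginal_log_odds_ratio(data, assoc, psi, x):
    """Plug-in estimate of the log marginal causal odds ratio for x+1 versus x.

    data  : list of (X_i, Z_i) pairs
    assoc : function (X, Z) -> fitted logit P(Y = 1 | X, Z)
    psi   : estimated conditional causal log odds ratio per unit of exposure
    x     : reference exposure level
    """
    def p_hat(level):
        # Average of the predicted counterfactual risks at the given level.
        return sum(expit(assoc(X, Z) - psi * (X - level)) for X, Z in data) / len(data)

    p0, p1 = p_hat(x), p_hat(x + 1)
    return math.log(p1 * (1.0 - p0) / (p0 * (1.0 - p1)))

# Hypothetical data and fitted association model.
data = [(0.0, 0), (1.0, 1), (2.0, 2), (0.5, 1)]
assoc = lambda X, Z: -1.0 + 0.3 * X + 0.2 * Z
estimate = marginal_log_odds_ratio(data, assoc, psi=0.5, x=0.0)
```

The same `p_hat` values give the marginal risk difference or relative risk by subtraction or division instead of the odds-ratio contrast.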

A drawback of this strategy, which we discuss in
the Appendix, is that even when model (27) is congenial
with the association model (16) for $x = 0$ (or
some other reference level), it need not be a well-specified
model for all $x$. We conjecture that when
this happens, it may be partially detectable
in the sense of yielding estimating equations with
no solution, as the uncongeniality is then due to the
nonexistence of a value of $\psi_x$ for some $x$ so that
$E\{Y(x) \mid Z, C\} = E\{Y(x) \mid C\}$ for all $(Z, C)$. As with
other causal models that are not guaranteed to be
congenial (e.g., Petersen et al., 2007; Tan, 2010) and
as confirmed in simulation studies in the next section,
we believe this is unlikely to induce an important
bias. The concern for bias is further alleviated
by the aforementioned local robustness property,
which continues to hold for extended logistic
structural mean models.

The idea of using conditional causal effect esti- 
mates as plug-in estimates in inference for marginal 
effects has been advocated in the biostatistical and 
epidemiological literature (see, e.g., Greenland, 1987; 
Ten Have et al., 2003) and is commonly employed in 
the econometrics literature (see, e.g., Blundell and 
Powell, 2004; Imbens and Newey, 2009), where re- 
lated proposals have been made starting from a semi- 
parametric control functions approach. Alternative 
approaches involve assuming that all confounders of 
the exposure effect can be captured into a scalar 
variate U, which has an additive effect on the out- 
come (Amemiya, 1974; Foster, 1997; Johnston et al., 
2008; Rassen et al., 2009) in the sense that 

(28)  $E(Y \mid X, C, U) = \operatorname{expit}(\beta_0^* + \psi^* X + \beta_1^* C) + U$,

where $\beta_0^*, \beta_1^*, \psi^*$ are unknown and where $E(U \mid C) = 0$;
note that $E(U \mid X, C) \neq 0$ when there is confounding.
Because, for each $x$, $Y(x) \perp\!\!\!\perp X \mid U, C$, model (28) implies
the marginal structural model

$E\{Y(x) \mid C\} = E[E\{Y(x) \mid X = x, C, U\} \mid C] = \operatorname{expit}(\beta_0^* + \psi^* x + \beta_1^* C)$

considered by Henneman, van der Laan and Hubbard
(2002). This clarifies that $\exp(\psi^*)$ in model (28) can
be interpreted as the marginal (i.e., population averaged)
causal odds ratio

$\exp(\psi^*) = \dfrac{\operatorname{odds}\{Y(1) = 1 \mid C\}}{\operatorname{odds}\{Y(0) = 1 \mid C\}}$.

Using that $Z \perp\!\!\!\perp U \mid C$ under the IV assumptions, an
estimator $\hat\psi$ for $\psi^*$ can be obtained by solving the
following unbiased estimating equations:

(29)  $0 = \sum_{i=1}^{n} (1, Z_i, C_i)^{T} \{Y_i - \operatorname{expit}(\beta_0 + \psi X_i + \beta_1 C_i)\}$.

The marginal causal odds ratio (6) can be identified
upon noting that

$P\{Y(x) = 1\} = E[\operatorname{expit}(\beta_0^* + \psi^* x + \beta_1^* C)]$;

it equals $\exp(\psi^*)$ when $C$ is empty. The marginal
causal odds ratio [(7), left] can be identified upon
noting that

$P\{Y(X + 1) = 1\} = E[\operatorname{expit}\{\beta_0^* + \psi^*(X + 1) + \beta_1^* C\}]$.

In the absence of covariates, it follows from the unbiasedness
of the estimating functions at $\psi^* = 0$ that
the resulting estimator is (locally) robust against
model misspecification at the null hypothesis of no
causal effect. However, it is not guaranteed to exist
and may be inconsistent for $\psi^* \neq 0$ because the
dichotomous nature of the outcome imposes strong
restrictions on the distribution of $U$, which may be
impossible to reconcile with the basic assumption
that $Z \perp\!\!\!\perp U \mid C$ (Henneman, van der Laan and Hubbard,
2002).

4. SIMULATION STUDY 

We conducted 5 simulation experiments, each with
a sample size of 1,000 and with 1,000 simulation
runs. As in Palmer et al. (2008), the instrumental
variable $Z$ was generated in such a manner as to
represent the number of copies (0, 1 or 2) of a single
bi-allelic SNP in Hardy-Weinberg equilibrium.
The underlying allele frequency in the population
was assumed to be $p = 0.3$, and so $Z$ was generated
from a multinomial distribution with cell probabilities
(0.09, 0.42, 0.49). The exposure $X$ was generated
to be $N(Z, 2)$ in simulation experiments a, b
and e, $Z + t_2$ in simulation experiment c and $\Gamma(Z, 1)$
in simulation experiment d [with $\Gamma(\cdot, \cdot)$ referring to
the Gamma distribution]. Finally, the outcome was
generated to satisfy

$P(Y = 1 \mid X, Z) = \operatorname{expit}(\beta_0 + \beta_x X + \beta_z Z)$,

where $\beta_0$ was fixed at different values to result in
outcome means of 0.05, 0.1, 0.25 and 0.5 and $\beta_x$ was
chosen to yield $Y(0) \perp\!\!\!\perp Z$ under the logistic structural
mean model (9) with $\psi$ equaling 0 or 1. Finally,
$\beta_z$ was set to 1 in simulation experiments a and e,
to 2 in simulation experiments b and c and to $-2$ in
simulation experiment d to correspond to different
degrees of unmeasured confounding. Indeed, note
that the conditional association $\beta_z$ between $Y(0)$
and $Z$, conditional on $X$, is largely explained by
the extent of unmeasured confounding.
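A data-generating step of this kind can be sketched in a few lines of Python. The sketch below is only an illustration in the style of experiment a: it assumes that $N(Z, 2)$ denotes mean $Z$ and variance 2, that the cell probabilities are listed in the order of 0, 1 and 2 copies, and it uses placeholder coefficient values rather than the calibrated ones of the study. The function name `simulate` is hypothetical.

```python
import math
import random

def expit(u):
    return 1.0 / (1.0 + math.exp(-u))

def simulate(n, beta0, beta_x, beta_z, sd=math.sqrt(2.0), seed=0):
    """Generate n triples (Z, X, Y) from the three-step design described above."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        # Instrument: number of allele copies with the stated cell probabilities.
        z = rng.choices([0, 1, 2], weights=[0.09, 0.42, 0.49])[0]
        # Exposure: normal around Z (standard deviation is an assumption).
        x = rng.gauss(z, sd)
        # Outcome: Bernoulli draw from the logistic association model.
        y = 1 if rng.random() < expit(beta0 + beta_x * x + beta_z * z) else 0
        data.append((z, x, y))
    return data

# Placeholder coefficients; the study calibrates beta0 and beta_x instead.
sample = simulate(n=20000, beta0=-2.0, beta_x=0.5, beta_z=1.0)
```

Calibrating $\beta_0$ to a target outcome mean and $\beta_x$ to the restriction $Y(0) \perp\!\!\!\perp Z$ would require an additional root-finding step that is not shown here.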

Table 1 compares the Wald estimator, the Ad- 
justed IV estimator and the logistic structural mean 
model estimator of the conditional causal log odds 
ratio. We do not report results for the semiparamet- 
ric control function approaches since these require 
the IV to be continuously distributed (Imbens and 
Newey, 2009). Table 1 demonstrates that the Wald 
estimator can have substantial bias when there is 
unmeasured confounding of the exposure-outcome 
association (cf. experiment b). As predicted by the 
theory, the adjusted IV estimator gives unbiased estimates
when the exposure has a symmetric distribution
with constant variance (cf. experiments a-c),
conditional on the IV, but not when the exposure
distribution is skewed (cf. experiment d) or when
an exposure-IV interaction is ignored (cf. experiment e).
Note, in particular, that the adjusted IV
estimator is not locally robust to model misspecification
at the causal null hypothesis $\psi^* = 0$, despite the
existence of an asymptotically distribution-free test.
The logistic SMM estimator is unbiased in all cases. 
It has slightly increased variance relative to the Ad- 
justed IV estimator when the exposure is normally 
distributed, but reduced variance when the exposure 
is t-distributed because of outlying exposure resid- 
uals (i.e., control functions) affecting the Adjusted 
IV estimator. 

Table 2 compares the proposed estimators of the
marginal log odds ratio (6) (labeled "MLOR 1")
and (7) (labeled "MLOR 2"), as well as the same estimators
where, for computational convenience, $\psi_x$
is substituted with $\psi_0$ for all $x$ (labeled "Approx.
MLOR 1" and "Approx. MLOR 2"). We do not
report results on the estimators obtained by solving (29)
since they performed very poorly, often
resulting in nonconvergence in over 80% of the simulation
runs. Table 2 demonstrates that the approximate
estimators perform adequately and much like



Table 1

Bias (×100), empirical standard deviation (×100) (ESE), average sandwich standard error (×100) (SSE) and coverage of
95% confidence intervals (Cov.) for the standard IV estimator, the adjusted IV estimator and the logistic structural mean
model estimator of the log conditional causal odds ratio

                          Standard IV                    Adjusted IV                   Logistic SMM
Exp.  E(Y)  $\psi$    Bias    ESE    SSE   Cov.      Bias    ESE    SSE   Cov.      Bias    ESE    SSE   Cov.

a     0.1   0         1.15   16.2   15.9   95.5   [illegible] 19.2  18.9   95.1      1.62   20.1   19.6   95.6
      0.05  1         3.82   30.8   30.4   96.1      3.92   30.8   30.5   96.0      5.31   33.0   32.2   96.2
      0.1   1         1.71   22.0   21.9   95.3      1.80   22.0   21.9   95.5      2.71   23.6   23.0   95.6
      0.25  1         0.68   15.0   15.0   95.5      0.77   15.0   15.1   95.6      1.24   15.8   15.7   95.3
      0.5   1         1.18   12.3   12.7   95.1      1.28   12.3   12.7   95.3      1.46   12.6   13.0   95.6

b     0.1   0         1.28   15.7   15.9   95.1      1.31   24.8   25.1   95.5      2.86   28.3   28.3   95.9
      0.05  1        -7.12   31.1   27.9   88.9      4.38   34.4   33.4   95.3      6.63   38.7   37.3   95.1
      0.1   1        -13.5   22.1   18.9   80.1      2.69   25.4   25.7   95.3      4.37   29.0   28.9   95.2
      0.25  1        -21.9   15.3   11.6   49.2      1.84   19.8   20.1   95.1      2.76   22.1   22.2   95.8
      0.5   1        -26.0   13.2   8.89   26.5      1.28   18.0   18.3   95.4      1.65   19.3   19.4   95.4

c     0.1   0         1.77   17.0   17.1   95.0      7.06   73.5   61.3   94.4      5.31   39.8   39.5   95.2
      0.05  1        -34.8   36.1   30.4   55.4      10.8   79.8   69.4   94.4      12.2   58.0   56.2   93.1
      0.1   1        -29.1   34.9   26.6   50.5      9.82   83.0   63.5   94.8      8.15   41.1   39.7   95.9
      0.25  1        -25.6   30.3   21.2   41.9      7.45   68.3   54.2   93.1      3.50   26.9   25.2   95.1
      0.5   1        -24.7   26.8   18.9   39.2      7.23   66.9   53.1   93.8      1.8    19.7   19.0   95.3

d     0.1   0         0.08   15.6   15.8   95.3     -56.2   25.6   26.2   40.8     -1.03   28.6   28.5   94.0
      0.05  1        -48.0   24.1   26.2   51.7     -91.8   47.0   43.5   42.4     -1.09   40.0   34.1   87.7
      0.1   1        -55.8   15.8   19.0   14.3     -83.4   32.3   31.9   22.3      1.16   33.5   31.6   88.2
      0.25  1        -65.2   9.87   13.1   0.00     -61.8   23.3   23.1   21.2      1.59   26.8   27.0   94.0
      0.5   1        -72.8   8.53   10.8   0.00     -27.0   19.9   20.3   76.3     -0.07   27.3   28.5   95.2

e     0.1   0         2.55   15.5   15.4   94.8      2.83   18.6   19.2   95.9      3.25   26.8   26.4   97.2
      0.05  1        -37.7   25.8   25.2   62.2     -37.4   26.0   25.8   64.0      13.4   56.3   52.4   91.0
      0.1   1        -36.6   18.4   18.3   45.0     -36.4   18.6   18.9   48.1      8.38   39.8   38.0   93.9
      0.25  1        -31.0   12.7   13.0   34.3     -30.9   12.7   13.2   35.6      4.83   24.4   24.3   95.7
      0.5   1        -19.1   10.7   11.9   61.8     -18.7   10.8   11.4   60.4      4.18   17.1   17.4   96.0


the proposed estimators, although the nominal cov- 
erage level is slightly better attained for the pro- 
posed estimators. Given the good agreement, the 
results in Table 3 are based on the computationally 
more attractive approximate estimators. Interest- 
ingly, it reveals that the estimators of the marginal 
causal log odds ratio have a much reduced vari- 
ance relative to the three considered estimators of 
the conditional causal log odds ratio. In particu- 
lar, highly efficient estimates are obtained for the 
marginal causal log odds ratio (6) which we regard 
to be of most interest in many practical applications, 
since it essentially expresses the result that would be 
obtained in a randomized experiment. 

5. APPLICATIONS 

5.1 Analysis of a Health Register 

Brookhart et al. (2006) and Brookhart and Schnee- 
weiss (2007) assess short-term effects of Cox-2 treat- 



ment (as compared to nonsteroidal anti-inflammato- 
ry treatment) on the risk of gastrointestinal (GI) 
bleeding within 60 days. As Table 4 shows, of the
37,842 new nonselective NSAID users drawn from
a large population-based cohort of Medicare beneficiaries
who were eligible for a state-run pharmaceutical
benefit plan, 26,407 patients were placed
on Cox-2 treatment. Let the received treatment $X$
equal 1 for subjects placed on Cox-2 and 0 for those
on nonselective NSAIDs. Let the outcome $Y$ indicate
1 for upper gastrointestinal (GI) bleeding within
60 days of initiating an NSAID and 0 otherwise. As
in Brookhart and Schneeweiss (2007), we use the 
physician's prescribing preference for Cox-2 (versus 
nonselective NSAIDs) Z as an instrumental variable 
for the effect of Cox-2 treatment on gastrointesti- 
nal bleeding. The Wald and adjusted IV estimator 
of the conditional causal odds ratio were found to 
be identical: 0.26 (95% confidence interval 0.084- 
0.79, P 0.018). In contrast, the logistic structural 
Table 2

Bias (×100), empirical standard deviation (×100) (ESE), average sandwich standard error (×100) (SSE) and coverage of
95% confidence intervals (Cov.) for the approximate and exact estimators of the logarithm of (6) (MLOR1) and the
logarithm of (7) (leftmost) (MLOR2)

                         Approx. MLOR 1                        MLOR 1
Exp.  E(Y)  $\psi$    Bias    ESE    SSE   Cov.        Bias        ESE    SSE   Cov.

a     0.1   0        -0.10   15.9   15.6   93.7       -0.30       15.7   15.6   94.9
      0.05  1        -0.31   9.82   9.79   93.6       -0.65       9.54   10.1   96.0
      0.1   1        -1.01   14.2   14.3   92.6       -1.52       14.0   15.0   95.2
      0.25  1         0.04   6.50   6.51   94.7       -0.16       6.31   6.52   95.6
      0.5   1         0.32   5.44   5.56   95.9        0.24       5.46   5.50   94.1

b     0.1   0        -0.49   15.5   15.9   94.1    [illegible]    16.1   16.1   [illegible]
      0.05  1        -0.04   11.0   10.8   94.0       -0.58       10.7   12.0   96.4
      0.1   1         0.23   8.29   8.35   94.2       -0.24       7.89   8.90   96.3
      0.25  1         0.18   6.42   6.49   95.1       -0.06       6.12   6.52   96.1
      0.5   1         0.07   5.80   5.87   95.8       -0.04       5.70   5.80   95.5

                         Approx. MLOR 2                        MLOR 2
Exp.  E(Y)  $\psi$    Bias    ESE    SSE   Cov.        Bias        ESE    SSE   Cov.

a     0.1   0         1.2    16.4   16.1   95.5        1.14       16.3   16.0   94.9
      0.05  1         1.57   20.6   20.2   95.6        1.04       19.4   23.9   94.8
      0.1   1         4.00   30.1   29.5   95.6        3.06       28.4   28.1   95.0
      0.25  1         0.24   12.7   12.7   95.1        0.00       12.0   14.0   95.7
      0.5   1         0.29   9.93   10.3   95.9        0.25       9.8    10.1   95.9

b     0.1   0         1.46   15.9   16.1   95.8        1.28       15.7   15.9   95.2
      0.05  1         3.74   28.4   28.7   96.4        2.97       26.3   26.7   95.3
      0.1   1         2.41   20.1   21.0   96.6        1.72       18.4   19.2   95.5
      0.25  1         1.24   14.2   15.1   96.5        0.80       12.8   13.9   96.0
      0.5   1         0.58   12.4   13.3   97.1        0.52       12.0   13.1   96.7

mean model estimator [both using the approach of 
Vansteelandt and Goetghebeur (2003) and using the 
approach of Robins and Rotnitzky (2004)] was found 
to be 0.081 (95% confidence interval 0.0095-0.82, 
P 0.018), which might be more reliable, consider- 
ing the nonnormality of the exposure distribution. 
The marginal causal odds ratio was estimated to 
be almost identical: 0.083 (95% confidence interval 
0.0096-0.82). We thus estimate roughly that the use 
of nonselective NSAIDs instead of Cox-2 increases 
the odds (or risk) of gastrointestinal bleeding by at 
least 18% (= 1 - 0.82). 
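As a check on the reported figures, the Wald-type estimate can be recomputed from the counts in Table 4. The Python sketch below assumes that, for a dichotomous instrument, the Wald estimator is the log odds ratio of $Y$ on $Z$ divided by the difference in $P(X = 1 \mid Z)$ between the two instrument levels; that reading is an assumption made here for illustration, and the function name is hypothetical.

```python
import math

# counts[(x, z)] = (number with Y = 0, number with Y = 1), taken from Table 4
counts = {
    (0, 0): (5640, 39),
    (1, 0): (6740, 60),
    (0, 1): (5722, 34),
    (1, 1): (19493, 114),
}

def wald_log_odds_ratio(counts):
    """Ratio of the Y-Z log odds ratio to the Z-induced shift in P(X = 1)."""
    def arm(z):
        n = sum(sum(counts[(x, z)]) for x in (0, 1))       # subjects with this Z
        events = sum(counts[(x, z)][1] for x in (0, 1))    # bleeds with this Z
        treated = sum(counts[(1, z)])                      # Cox-2 users with this Z
        return n, events, treated

    n0, e0, t0 = arm(0)
    n1, e1, t1 = arm(1)
    log_or_yz = math.log(e1 / (n1 - e1)) - math.log(e0 / (n0 - e0))
    shift_in_exposure = t1 / n1 - t0 / n0
    return log_or_yz / shift_in_exposure

wald_or = math.exp(wald_log_odds_ratio(counts))
```

Under this reading the computed odds ratio rounds to the 0.26 quoted in the text; the confidence interval would additionally require the sandwich variance described in the Appendix.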

Besides the IV assumptions, all results rely on the
assumption that the effect of Cox-2 versus nonselective
NSAIDs is the same in Cox-2 users whose physician
prefers Cox-2 treatment as in Cox-2 users whose
physician prefers nonselective NSAIDs (and likewise
for the effect of nonselective NSAIDs). The results are
in stark contrast with the estimate obtained from
an unadjusted logistic regression analysis: 1.12 (95%
confidence interval 0.85-1.5).



5.2 Analysis of Randomized Cholesterol 
Reduction Trial with Noncompliance 

We reanalyze the cholesterol reduction trial reported
in Ten Have et al. (2003). Let $Y$ be an indicator
of treatment success (defined as a beneficial
change in cholesterol), $X$ be an indicator of
using educational dietary home-based audio tapes
(which equals 0 on the control arm) and $Z$ be the
experimental assignment to the use of educational
dietary home-based audio tapes. The Wald estimator
of the conditional causal odds ratio was found to
be 1.37 (95% confidence interval 0.68-2.74, P 0.38),
and analogous to the logistic structural mean model
estimator, 1.31 (95% confidence interval 0.72-2.40,
P 0.37). This expresses that in patients who used
the audio tapes on the intervention arm, the odds
of a beneficial reduction in cholesterol would have
been 1.31 times lower had they not received the
intervention. The adjusted IV estimator was uninformative:
0.020 (95% confidence interval $0$-$10^{171}$,
Table 3

Bias (×100), empirical standard deviation (×100) (ESE), average sandwich standard error (×100) (SSE) and coverage of
95% confidence intervals (Cov.) for the logistic structural mean model estimator of the log conditional causal odds ratio (4),
the approximate estimator of the logarithm of (6) (MLOR1) and the logarithm of (7) (leftmost) (MLOR2)

                          Logistic SMM                     MLOR 1                        MLOR 2
Exp.  E(Y)  $\psi$    Bias    ESE    SSE   Cov.      Bias    ESE    SSE   Cov.      Bias    ESE    SSE   Cov.

a     0.1   0         1.62   20.1   19.6   95.6     -0.10   15.9   15.6   93.7      1.23   16.4   16.1   95.5
      0.05  1         5.31   33.0   32.2   96.2     -0.31   9.82   9.79   93.6      1.57   20.6   20.2   95.6
      0.1   1         2.71   23.6   23.0   95.6     -1.01   14.2   14.3   92.6      4.00   30.1   29.5   95.6
      0.25  1         1.24   15.8   15.7   95.3      0.04   6.50   6.51   94.7      0.24   12.7   12.7   95.1
      0.5   1         1.46   12.6   13.0   95.6      0.32   5.44   5.56   95.9      0.29   9.93   10.3   95.9

b     0.1   0         2.86   28.3   28.3   95.9     -0.49   15.5   15.9   94.1      1.46   15.9   16.1   95.8
      0.05  1         6.63   38.7   37.3   95.1     -0.04   11.0   10.8   94.0      3.74   28.4   28.7   96.4
      0.1   1         4.37   29.0   28.9   95.2      0.23   8.29   8.35   94.2      2.41   20.1   21.0   96.6
      0.25  1         2.76   22.1   22.2   95.8      0.18   6.42   6.49   95.1      1.24   14.2   15.1   96.5
      0.5   1         1.65   19.3   19.4   95.4      0.07   5.80   5.87   95.8      0.58   12.4   13.3   97.1

c     0.1   0         5.31   39.8   39.5   95.2      0.85   15.7   15.7   94.8      2.47   16.5   16.5   95.3
      0.05  1         12.2   58.0   56.2   93.1      1.30   17.4   17.0   91.0      7.10   36.4   36.4   93.2
      0.1   1         8.15   41.1   39.7   95.9      1.19   13.1   12.8   92.9      4.72   26.7   26.7   95.5
      0.25  1         3.50   26.9   25.2   95.1      0.35   9.24   8.69   93.9      1.62   17.2   16.8   95.6
      0.5   1         1.8    19.7   19.0   95.3      0.04   6.80   6.63   95.2      0.46   12.1   12.4   96.1

d     0.1   0        -1.03   28.6   28.5   94.0      0.31   21.3   20.6   92.9      2.31   21.6   21.4   97.0
      0.05  1        -1.09   40.0   34.1   87.7     -1.52   22.0   20.0   84.3      11.3   52.4   48.9   90.0
      0.1   1         1.16   33.5   31.6   88.2      0.17   14.9   14.8   90.9      6.78   36.2   35.0   93.5
      0.25  1         1.59   26.8   27.0   94.0      1.00   9.56   9.79   93.0      3.45   21.3   21.4   95.7
      0.5   1        -0.07   27.3   28.5   95.2      1.42   7.36   7.4    92.9      2.39   13.8   14.1   95.6

e     0.1   0         3.25   26.8   26.4   97.2      1.66   15.2   15.2   93.4     -0.73   15.1   15.1   93.9
      0.05  1         13.4   56.3   52.4   91.0      6.54   41.8   36.7   82.5     -1.93   13.3   11.5   85.2
      0.1   1         8.38   39.8   38.0   93.9      6.50   33.5   31.8   87.2     -0.38   11.1   10.7   83.6
      0.25  1         4.83   24.4   24.3   95.7      2.88   22.1   22.4   93.3      0.46   9.51   9.64   91.9
      0.5   1         4.18   17.1   17.4   96.0     -3.39   19.3   19.9   94.4      0.08   11.3   11.9   95.1

Table 4

Observed data with $X_i$ indicating received treatment [Cox-2
(1) versus nonselective NSAIDs (0)], $Z_i$ indicating the
physician's prescribing preference [Cox-2 (1) versus
nonselective NSAIDs (0)], and $Y_i$ indicating gastrointestinal
(GI) bleeding (1) within 60 days of initiating an NSAID for
subject $i$

                  Z_i = 0                  Z_i = 1
             Y_i = 0   Y_i = 1        Y_i = 0   Y_i = 1
X_i = 0       5640       39            5722       34
X_i = 1       6740       60           19493      114

P 0.99). The marginal causal odds ratio (6) was estimated
to be 1.28 (95% confidence interval 0.74-2.19,
P 0.38). It expresses that, had all patients complied
perfectly with their assigned treatment, the
intention-to-treat analysis would have resulted in an
odds ratio of 1.28. Since the exposure is dichotomous,
the marginal causal odds ratio (7) is not of
interest. Since subjects on the control arm have no
access to the audio tapes, model (9) is only relevant
for those who were assigned to the intervention arm
(i.e., $Z = 1$); hence, this analysis does not rely on
untestable assumptions regarding the absence of exposure
effect modification by the instrumental variable.

5.3 Analysis of Randomized Blood Pressure Trial 
With Noncompliance 

We reanalyze the blood pressure study reported in
Vansteelandt and Goetghebeur (2003). Let $Y$ be an
indicator of successful blood pressure reduction, $X$
measure the percentage of assigned active dose which
was actually taken (which equals 0 on the control
arm) and $Z$ be the experimental assignment to active
treatment or placebo. The Wald and adjusted
IV estimator of the conditional causal odds ratio
were found to be identical, 4.29 (95% confidence interval
1.6-11.3, P 0.0032), and analogous to the logistic
structural mean model estimator, 4.44 (95%
confidence interval 1.6-12.6, P 0.0049). This expresses
that in patients on the intervention arm with
unit exposure per day, the odds of a beneficial reduction
in diastolic blood pressure would have been 4.44
times lower had they not received the experimental
treatment. The marginal causal odds ratio (6) was
estimated to be 4.12 (95% confidence interval 1.6-10.3,
P 0.0025). It expresses that, had all patients
complied perfectly with their assigned treatment,
the intention-to-treat analysis would have resulted
in an odds ratio of 4.12.

APPENDIX 

A.1 Closed-Form Estimator

When $X$ and $Z$ are both dichotomous, taking values
0 and 1, the logistic structural mean model estimator
is obtainable in closed form as

(30)  $\hat\psi = \log\left\{\dfrac{-Q_1 \pm \sqrt{Q_1^2 - 4 Q_2 (Q_2 - \bar X_{11} + \bar X_{10}) Q_3}}{2 Q_2}\right\}$,

where $\bar X_{xz}$ is the proportion of subjects with $X = x$
among those with $Z = z$, and

$Q_1 = (Q_2 + \bar X_{10}) \exp(\hat\beta_0 + \hat\beta_1) + (Q_2 - \bar X_{11}) \exp(\hat\beta_0 + \hat\beta_1 + \hat\beta_2 + \hat\beta_3)$,

$Q_2 = \operatorname{expit}(\hat\beta_0) \bar X_{00} - \operatorname{expit}(\hat\beta_0 + \hat\beta_2) \bar X_{01}$,

$Q_3 = \exp(\hat\beta_0 + \hat\beta_1 + \hat\beta_2 + \hat\beta_3) \times \exp(\hat\beta_0 + \hat\beta_1)$.
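The closed form can be checked numerically against the estimating equation it is meant to solve. The Python sketch below assumes a saturated association model $\operatorname{logit} P(Y = 1 \mid X, Z) = \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 XZ$, which is what the four coefficients in $Q_1$ to $Q_3$ suggest, and takes the root of the quadratic $Q_2 s^2 + Q_1 s + Q_3(Q_2 - \bar X_{11} + \bar X_{10}) = 0$ in $s = \exp(\psi)$. The coefficient and proportion values are hypothetical.

```python
import math

def expit(u):
    return 1.0 / (1.0 + math.exp(-u))

def closed_form_psi(beta, xbar):
    """Real solutions psi of the binary-X, binary-Z estimating equation.

    beta : (b0, b1, b2, b3) of the assumed saturated association model
    xbar : xbar[(x, z)] = proportion with X = x among subjects with Z = z
    """
    b0, b1, b2, b3 = beta
    odds10 = math.exp(b0 + b1)
    odds11 = math.exp(b0 + b1 + b2 + b3)
    q2 = expit(b0) * xbar[(0, 0)] - expit(b0 + b2) * xbar[(0, 1)]
    q1 = (q2 + xbar[(1, 0)]) * odds10 + (q2 - xbar[(1, 1)]) * odds11
    q3 = odds11 * odds10
    disc = q1 * q1 - 4.0 * q2 * (q2 - xbar[(1, 1)] + xbar[(1, 0)]) * q3
    roots = []
    for sign in (1.0, -1.0):
        s = (-q1 + sign * math.sqrt(disc)) / (2.0 * q2)
        if s > 0:  # only positive s gives a real log
            roots.append(math.log(s))
    return roots

def estimating_function(psi, beta, xbar):
    """Difference between Z = 1 and Z = 0 in the mean predicted Y(0) risk."""
    b0, b1, b2, b3 = beta
    means = []
    for z in (0, 1):
        unexposed = expit(b0 + b2 * z)
        exposed = expit(b0 + b1 + b2 * z + b3 * z - psi)
        means.append(xbar[(0, z)] * unexposed + xbar[(1, z)] * exposed)
    return means[1] - means[0]

# Hypothetical fitted coefficients and exposure proportions.
beta = (-1.0, 1.0, 0.5, 0.0)
xbar = {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.3, (1, 1): 0.7}
roots = closed_form_psi(beta, xbar)
```

With these inputs only one of the two signs yields a positive argument of the logarithm, so the estimator is unique; in other configurations both or neither sign may do so.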

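A.2 Standard Errors for Conditional Causal Log
Odds Ratio Estimators

Suppose that $X$ satisfies the conditional mean model

$E(X \mid Z, C) = g(Z, C; \theta^*)$,

where $g(Z, C; \theta)$ is a known function, smooth in $\theta$,
and $\theta^*$ is an unknown finite-dimensional parameter;
for example, $g(Z, C; \theta) = \theta_0 + \theta_1 Z + \theta_2 C$. With
$R(\theta^*) = X - g(Z, C; \theta^*)$, assume further that

$\operatorname{logit} E\{Y \mid Z, C, R(\theta^*)\} = m_0(C, R(\theta^*); \omega^*) + m(C; \psi^*) g(Z, C; \theta^*)$,

where $m_0(C, R(\theta^*); \omega)$ is a known function, smooth
in $\omega$, and $\omega^*$ is an unknown finite-dimensional parameter;
for example, $m_0(C, R(\theta^*); \omega) = \omega_0 + \omega_1 C + \omega_2 R(\theta^*)$.
Then the adjusted IV estimator is equivalently
obtained by solving the multivariate score
equation $\sum_{i=1}^{n} S_i(\xi) = 0$ for $\xi = (\theta', \omega', \psi')'$ and taking
the solution for $\psi$, where $S_i(\theta, \omega, \psi)$ equals

(31)  $\begin{pmatrix} \dfrac{\partial g}{\partial \theta}(Z_i, C_i; \theta)\, \operatorname{Var}^{-1}(X_i \mid Z_i, C_i)\, R_i(\theta) \\[2ex] \dfrac{\partial}{\partial(\omega, \psi)}\{m_0(C_i, R_i(\theta); \omega) + m(C_i; \psi) g(Z_i, C_i; \theta)\} \cdot [Y_i - \operatorname{expit}\{m_0(C_i, R_i(\theta); \omega) + m(C_i; \psi) g(Z_i, C_i; \theta)\}] \end{pmatrix}$.

The asymptotic variance of the adjusted IV estimator
can now be obtained from the usual "sandwich" expression
for the solution of $\sum_{i=1}^{n} S_i(\xi) = 0$.

The asymptotic variance of the standard IV estimator
is similarly obtained upon redefining $m_0(C, R(\theta^*); \omega)$
to be a function of only $C$ and $\omega$. The asymptotic
variance of the logistic SMM-estimator is obtained
as in Vansteelandt and Goetghebeur (2003).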
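For reference, the sandwich expression for an estimator $\hat\xi$ that solves $\sum_{i} S_i(\xi) = 0$ has the following textbook M-estimation form. It is stated here as the standard result rather than as a transcription of the appendix, and the symbols $A$, $B$ and $\xi^*$ (the limiting value of $\hat\xi$) are introduced only for this illustration.

```latex
\sqrt{n}\,(\hat\xi - \xi^*) \;\xrightarrow{d}\; N\!\left(0,\; A^{-1} B \,(A^{-1})^{T}\right),
\qquad
A = E\left\{\frac{\partial S_i(\xi^*)}{\partial \xi^{T}}\right\},
\qquad
B = E\left\{S_i(\xi^*)\, S_i(\xi^*)^{T}\right\}.
```

In practice $A$ and $B$ would be replaced by sample averages evaluated at $\hat\xi$, and the variance of $\hat\psi$ read off the corresponding diagonal block after division by $n$.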

A. 3 Theoretical Comparison of the Adjusted IV 
Estimator and the Logistic Structural Mean 
Model Estimator 

To simplify the exposition, suppose that there are 
no covariates. Assume that X is normally distributed, 
conditional on Z. Let the adjusted IV estimator be 
based on the model 

logit P(Y = 1\R, Z)=ujq + ojiR + lo 2 E(X\Z), 

and assume, for the purpose of comparability, that 
this is also the association model underlying the lo- 
gistic structural mean model estimator [e.g., when 
E(X|Z) is linear in Z, then this is equivalent with 
a standard logistic regression model with main ef- 
fects in X and Z\. Under model (9), it then follows 
that 

logit p(y(o) = l\X, Z) 

= uj + (wi - ijj)R + (oj 2 - 1))E(X\Z). 

We will now demonstrate that the adjusted IV es- 
timator uj 2 is a consistent estimator of the causal 
parameter ip* indexing the logistic structural mean 
model. We will do so by demonstrating that the es- 
timating equations for the logistic structural mean 
model estimator ijj have mean zero at ip = uj 2 . 

Note that, at $\omega_2 = \psi$, $\operatorname{logit} P\{Y(0) = 1 \mid X, Z\} = \omega_0 + (\omega_1 - \psi) R$. A Taylor series expansion of the estimating function for $\psi$, that is,

\[
[d(Z) - E\{d(Z)\}]\, \mathrm{expit}[\omega_0 + (\omega_1 - \psi)\{X - E(X \mid Z)\}],
\]

around $X = E(X \mid Z)$ then gives

\[
\sum_{k=0}^{\infty} [d(Z) - E\{d(Z)\}]\, \{X - E(X \mid Z)\}^k\, \frac{(\omega_1 - \psi)^k}{k!}\, \mathrm{expit}^{(k)}(\omega_0),
\]

where $\mathrm{expit}^{(k)}(\omega_0)$ refers to the $k$th-order derivative of $\mathrm{expit}(\omega_0)$ w.r.t. $\omega_0$. When $X$ is normally distributed, conditional on $Z$, with constant variance, then this is a mean zero equation because then $E[\{X - E(X \mid Z)\}^k \mid Z] = E[\{X - E(X \mid Z)\}^k]$ for all $k$. It thus follows that $\hat{\omega}_2$ is a consistent estimator of the causal parameter $\psi^*$. This result continues to hold for distributions other than the normal which satisfy that, for each $k$, either $E[\{X - E(X \mid Z)\}^k \mid Z] = E[\{X - E(X \mid Z)\}^k]$ or $\mathrm{expit}^{(k)}(\omega_0) = 0$. For instance, when $X$ is normally distributed, conditional on $Z$, with variance depending on $Z$ and when, in addition, $\mathrm{expit}(\omega_0) = 1/2$, then $\hat{\omega}_2$ stays a consistent estimator of the causal parameter $\psi^*$: the odd central moments satisfy $E[\{X - E(X \mid Z)\}^k \mid Z] = 0$ for all odd $k$, while $\mathrm{expit}^{(k)}(\omega_0) = 0$ for all even $k \geq 2$ because $\mathrm{expit}(\omega) - 1/2$ is an odd function of $\omega$; for example, $\mathrm{expit}^{(2)}(\omega_0) = \mathrm{expit}(\omega_0)\{1 - \mathrm{expit}(\omega_0)\}\{1 - 2\,\mathrm{expit}(\omega_0)\} = 0$.
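
The role of the condition $\mathrm{expit}(\omega_0) = 1/2$ can be checked numerically. The sketch below approximates $E[\mathrm{expit}\{\omega_0 + c\,\varepsilon\}]$ for $\varepsilon \sim N(0, \sigma^2)$ by the trapezoidal rule, where $c$ plays the role of $\omega_1 - \psi$ and $\sigma$ stands for a residual standard deviation that may vary with $Z$. The function name `mean_expit` and all numerical values are illustrative assumptions.

```python
import math


def expit(t):
    return 1.0 / (1.0 + math.exp(-t))


def mean_expit(omega0, slope, sigma, half_width=10.0, steps=20001):
    """Trapezoidal approximation of E[expit(omega0 + slope * eps)],
    with eps ~ N(0, sigma^2), integrating over +/- half_width * sigma."""
    lo = -half_width * sigma
    h = 2.0 * half_width * sigma / (steps - 1)
    total = 0.0
    for j in range(steps):
        e = lo + j * h
        w = 0.5 if j in (0, steps - 1) else 1.0
        dens = math.exp(-0.5 * (e / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += w * expit(omega0 + slope * e) * dens
    return total * h


# With omega0 = 0 (expit(omega0) = 1/2) the mean does not depend on sigma,
# so it cannot depend on Z through a Z-dependent variance.
centred = [mean_expit(0.0, 1.5, s) for s in (0.5, 2.0)]

# With omega0 = 1 the mean changes with sigma, so heteroscedasticity matters.
offset = [mean_expit(1.0, 1.5, s) for s in (0.5, 2.0)]
```

Both entries of `centred` equal 1/2 up to integration error, whereas the two entries of `offset` differ, which is the numerical counterpart of the argument above.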

A.4 Local Robustness

Suppose first that $C_i$ is empty, $d(Z_i, C_i) = Z_i$ and $E\{d(Z_i, C_i) \mid C_i\} = \sum_{j=1}^n Z_j / n$. When $\psi^* = 0$, then equation (18) becomes

\[
\sum_{i=1}^n \left( Z_i - \frac{\sum_{j=1}^n Z_j}{n} \right) \mathrm{expit}\{m(X_i, Z_i; \hat{\beta})\}.
\]

Suppose now that the association model includes an intercept and main effect in $Z_i$, and that $\hat{\beta}$ is the standard maximum likelihood estimator of $\beta^*$. We then show that equation (18) equals

\[
\sum_{i=1}^n \left( Z_i - \frac{\sum_{j=1}^n Z_j}{n} \right) Y_i,
\]

which has mean zero at $\psi^* = 0$, even under model misspecification. That this equality is true follows because $\hat{\beta}$ satisfies the following score equations:

\[
0 = \sum_{i=1}^n \begin{pmatrix} 1 \\ Z_i \end{pmatrix} [Y_i - \mathrm{expit}\{m(X_i, Z_i; \hat{\beta})\}],
\]

from which $\sum_{i=1}^n Z_i Y_i = \sum_{i=1}^n Z_i\, \mathrm{expit}\{m(X_i, Z_i; \hat{\beta})\}$ and

\[
\sum_{i=1}^n \frac{\sum_{j=1}^n Z_j}{n}\, Y_i = \sum_{i=1}^n \frac{\sum_{j=1}^n Z_j}{n}\, \mathrm{expit}\{m(X_i, Z_i; \hat{\beta})\}.
\]

Extending this argument, it is seen that local robustness is attained whenever the association model includes an additive term in $d(Z_i, C_i) - E\{d(Z_i, C_i) \mid C_i\}$.
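
The identity behind local robustness is purely algebraic and can be verified on simulated data. The following sketch fits a logistic association model with an intercept and main effects in $X$ and $Z$ by Newton-Raphson, then compares the two sums. The data-generating mechanism, the helper names `solve` and `fit_logistic`, and all numerical values are assumptions made for illustration; the outcome is deliberately generated from a model that the fitted association model misspecifies.

```python
import math
import random


def expit(t):
    return 1.0 / (1.0 + math.exp(-t))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_logistic(rows, y, iters=25):
    """Maximum likelihood logistic regression via Newton-Raphson."""
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iters):
        fit = [expit(sum(b * v for b, v in zip(beta, r))) for r in rows]
        score = [sum(r[k] * (yi - fi) for r, yi, fi in zip(rows, y, fit))
                 for k in range(p)]
        info = [[sum(r[j] * r[k] * fi * (1.0 - fi) for r, fi in zip(rows, fit))
                 for k in range(p)] for j in range(p)]
        beta = [b + s for b, s in zip(beta, solve(info, score))]
    return beta


random.seed(0)
n = 400
z = [random.gauss(0.0, 1.0) for _ in range(n)]          # instrument
u = [random.gauss(0.0, 1.0) for _ in range(n)]          # unmeasured confounder
x = [0.8 * zi + ui + random.gauss(0.0, 1.0) for zi, ui in zip(z, u)]
# No exposure effect (psi* = 0); Y depends on u only, nonlinearly.
y = [1 if random.random() < expit(ui * ui - 1.0) else 0 for ui in u]

rows = [[1.0, xi, zi] for xi, zi in zip(x, z)]           # intercept, X, Z
beta_hat = fit_logistic(rows, y)
fitted = [expit(sum(b * v for b, v in zip(beta_hat, r))) for r in rows]

z_bar = sum(z) / n
lhs = sum((zi - z_bar) * fi for zi, fi in zip(z, fitted))
rhs = sum((zi - z_bar) * yi for zi, yi in zip(z, y))
```

Up to the convergence tolerance of the Newton iterations, `lhs` and `rhs` coincide even though the fitted model omits `u`, because the score equations for the intercept and for $Z$ hold exactly at the maximum likelihood estimate.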



A.5 Uncongenial Models

It follows from the parameterization of Robins and Rotnitzky (2004) that, for each law $f(X \mid Z, C)$, the logistic structural mean model (8) is congenial with association models of the form

\[
P(Y = 1 \mid X, Z, C) = \mathrm{expit}\{m(C; \psi^*) X + q(X, Z, C) + v(Z, C)\}
\]

for each function $q(X, Z, C)$ of $(X, Z, C)$ satisfying $q(0, Z, C) = 0$ for all $Z, C$, each function $t(C)$ of $C$, and $v(Z, C)$ solving

\[
t(C) = \int \mathrm{expit}\{q(X = x, Z, C) + v(Z, C)\}\, f(X = x \mid Z, C)\, dx.
\]

It thus follows that, for each law $f(X \mid Z, C)$, the logistic structural mean model (8) is also congenial with association models of the form

\[
P(Y = 1 \mid X, Z, C) = \mathrm{expit}\{m(C; \psi^*) X + q(X, Z, C) + t^*(C) + v^*(Z, C)\}
\tag{32}
\]

for each such function $q(X, Z, C)$, each function $t^*(C)$ of $C$, and $v^*(Z, C)$ satisfying $v^*(0, C) = 0$ for all $C$ and

\[
\int \mathrm{expit}\{q(X = x, 0, C) + t^*(C)\}\, f(X = x \mid Z = 0, C)\, dx
= \int \mathrm{expit}\{q(X = x, Z, C) + t^*(C) + v^*(Z, C)\}\, f(X = x \mid Z, C)\, dx
\tag{33}
\]

for each $Z$. Indeed, this follows upon defining $t^*(C)$ as the solution to

\[
t(C) = \int \mathrm{expit}\{q(X = x, 0, C) + t^*(C)\}\, f(X = x \mid Z = 0, C)\, dx.
\]

It follows that a given association model is congenial with the logistic structural mean model (8) when no restrictions are imposed on the function $v^*(Z, C)$, which encodes the main effect of $Z$, along with interactions with $C$. The above derivation also suggests an easier strategy for fitting the model of Robins and Rotnitzky (2004), whereby the association model is of the form (32) and integral equations of the form (33) are solved.
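
To make the last step concrete, the sketch below solves an integral equation of the form (33) for $v^*(Z)$ by bisection in a toy setting without covariates. It assumes a binary instrument, $X \mid Z \sim N(\alpha_0 + \alpha_1 Z, 1)$, a selection function $q(x, z) = 0.5x + 0.3xz$ and $t^* = -0.2$; these choices, and the helper names, are hypothetical and serve only to illustrate the computation.

```python
import math


def expit(t):
    return 1.0 / (1.0 + math.exp(-t))


def integrate_normal(func, mean, sd=1.0, half_width=8.0, steps=4001):
    """Trapezoidal approximation of E[func(X)] for X ~ N(mean, sd^2)."""
    lo = mean - half_width * sd
    h = 2.0 * half_width * sd / (steps - 1)
    total = 0.0
    for j in range(steps):
        xv = lo + j * h
        w = 0.5 if j in (0, steps - 1) else 1.0
        total += w * func(xv) * math.exp(-0.5 * ((xv - mean) / sd) ** 2)
    return total * h / (sd * math.sqrt(2.0 * math.pi))


def q(xv, zv):
    # Assumed selection function; satisfies q(0, z) = 0 for all z.
    return 0.5 * xv + 0.3 * xv * zv


T_STAR = -0.2               # assumed value of t*
ALPHA0, ALPHA1 = 0.0, 1.0   # assumed exposure model E(X | Z) = ALPHA0 + ALPHA1 * Z


def mean_y0(zv, v):
    """Integral of expit{q(x, z) + t* + v} against f(x | Z = z)."""
    return integrate_normal(lambda xv: expit(q(xv, zv) + T_STAR + v),
                            ALPHA0 + ALPHA1 * zv)


def solve_v_star(zv, lo=-20.0, hi=20.0, tol=1e-10):
    """Bisection for v*(z): match the left-hand side of (33), where v*(0) = 0."""
    target = mean_y0(0, 0.0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_y0(zv, mid) < target:   # the integral is increasing in v
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


v1 = solve_v_star(1)
```

Bisection suffices here because the right-hand side of (33) is strictly increasing in $v^*(Z, C)$; with covariates the same one-dimensional search would be repeated for each value of $(Z, C)$.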

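Consider now the extended logistic SMM (27). Suppose that model (27) is congenial with the association model (16) for $x = 0$ in the sense that, for the given $\beta^*$, there exists a value $\psi_0^*$ such that

\[
\int \mathrm{expit}\{m(X, Z, C; \beta^*) - m(C; \psi_0^*) X\}\, f(X \mid Z, C)\, dX
\]

does not depend on $Z$. Then it does not necessarily follow that there exists a value $\psi_x^*$ for given $x$ such that

\[
\int \mathrm{expit}\{m(X, Z, C; \beta^*) - m(C; \psi_x^*)(X - x)\}\, f(X \mid Z, C)\, dX
\]

does not depend on $Z$. Model (27) being congenial with the association model (16) for $x = 0$ hence does not imply congeniality for all $x$.

A.6 Probit-Normal SMM Estimator

We explain how to derive $E\{Y(0) \mid Z, C\}$ under models (22) and (23). Note that

\[
E\{Y(0) \mid Z, X, C\} = P(U < \theta_0^* + \theta_1^* X + \theta_2^* Z + \theta_3^* C - \phi^* X),
\]

where $U$ is a standard normally distributed variate, independent of $(Z, X)$. Averaging over the exposure, conditional on $Z$ and $C$, then yields

\[
E\{Y(0) \mid Z, C\} = \int_{-\infty}^{\infty} P\{U + (\phi^* - \theta_1^*) x < \theta_0^* + \theta_2^* Z + \theta_3^* C\}\, dF(x \mid Z, C),
\]

where $F(X \mid Z, C)$ refers to the conditional distribution of $X$, given $Z$ and $C$. Define $U^* = U + (\phi^* - \theta_1^*) X$. Then, for normally distributed $X$ with mean $\alpha_0^* + \alpha_1^* Z + \alpha_2^* C$ and constant variance $\sigma^{*2}$, conditional on $Z$ and $C$, $U^*$ has a normal distribution with mean $(\phi^* - \theta_1^*)(\alpha_0^* + \alpha_1^* Z + \alpha_2^* C)$ and variance $1 + (\phi^* - \theta_1^*)^2 \sigma^{*2}$. Then

\[
E\{Y(0) \mid Z, C\}
= \int 1(U^* < \theta_0^* + \theta_2^* Z + \theta_3^* C)\, dF(U^*, X \mid Z, C)
= \Phi\left\{ \frac{\theta_0^* + \theta_2^* Z + \theta_3^* C - (\phi^* - \theta_1^*)(\alpha_0^* + \alpha_1^* Z + \alpha_2^* C)}{\sqrt{1 + (\phi^* - \theta_1^*)^2 \sigma^{*2}}} \right\},
\]

where $\Phi$ denotes the standard normal distribution function, which is as given in (24). The conditional mean $E(Y \mid Z, C)$ can be derived using similar arguments.

A.7 Standard Errors for Marginal Causal Log Odds Ratio Estimators

Consider the marginal log odds ratio defined by

\[
\eta = \log \frac{\mu_1 (1 - \mu_0)}{\mu_0 (1 - \mu_1)},
\tag{34}
\]

where $\mu_x = E[\mathrm{expit}\{m(X, Z, C; \beta^*) + m(C; \psi_x^*)(x - X)\}]$ for $x = 0, 1$, and let the corresponding estimators be $\hat{\eta}$ and $\hat{\mu}_x$, $x = 0, 1$, respectively. Then a Taylor series expansion shows that

\[
\begin{aligned}
0 &= \frac{1}{\sqrt{n}} \sum_{i=1}^n \left[ \mathrm{expit}\{m(X_i, Z_i, C_i; \hat{\beta}) + m(C_i; \hat{\psi}_x)(x - X_i)\} - \hat{\mu}_x \right] \\
&= \frac{1}{\sqrt{n}} \sum_{i=1}^n \left[ \mathrm{expit}\{m(X_i, Z_i, C_i; \beta) + m(C_i; \psi_x)(x - X_i)\} - \mu_x \right] \\
&\quad + E\left[ \frac{\partial}{\partial \theta_x^T} \mathrm{expit}\{m(X_i, Z_i, C_i; \beta) + m(C_i; \psi_x)(x - X_i)\} \right]
E^{-1}\left\{ -\frac{\partial U_{ix}(\theta_x)}{\partial \theta_x^T} \right\} \frac{1}{\sqrt{n}} \sum_{i=1}^n U_{ix}(\theta_x) \\
&\quad - \sqrt{n}(\hat{\mu}_x - \mu_x) + o_p(1),
\end{aligned}
\]

where $\theta_x = (\beta^T, \psi_x)^T$ and $U_{ix}(\theta_x)$ is the vector of estimating functions for $\theta_x$, from which the influence function for $\hat{\mu}_x$ is

\[
\mathrm{expit}\{m(X_i, Z_i, C_i; \beta) + m(C_i; \psi_x)(x - X_i)\} - \mu_x
+ E\left[ \frac{\partial}{\partial \theta_x^T} \mathrm{expit}\{m(X_i, Z_i, C_i; \beta) + m(C_i; \psi_x)(x - X_i)\} \right]
E^{-1}\left\{ -\frac{\partial U_{ix}(\theta_x)}{\partial \theta_x^T} \right\} U_{ix}(\theta_x).
\]

From the Delta method, it then follows that the influence function for $\hat{\eta}$ is

\[
\begin{aligned}
&\frac{1}{\mu_1 (1 - \mu_1)} \Bigg[ \mathrm{expit}\{m(X_i, Z_i, C_i; \beta) + m(C_i; \psi_1)(1 - X_i)\} - \mu_1 \\
&\qquad + E\left[ \frac{\partial}{\partial \theta_1^T} \mathrm{expit}\{m(X_i, Z_i, C_i; \beta) + m(C_i; \psi_1)(1 - X_i)\} \right]
E^{-1}\left\{ -\frac{\partial U_{i1}(\theta_1)}{\partial \theta_1^T} \right\} U_{i1}(\theta_1) \Bigg] \\
&- \frac{1}{\mu_0 (1 - \mu_0)} \Bigg[ \mathrm{expit}\{m(X_i, Z_i, C_i; \beta) + m(C_i; \psi_0)(0 - X_i)\} - \mu_0 \\
&\qquad + E\left[ \frac{\partial}{\partial \theta_0^T} \mathrm{expit}\{m(X_i, Z_i, C_i; \beta) + m(C_i; \psi_0)(0 - X_i)\} \right]
E^{-1}\left\{ -\frac{\partial U_{i0}(\theta_0)}{\partial \theta_0^T} \right\} U_{i0}(\theta_0) \Bigg].
\end{aligned}
\]

The asymptotic variance of $\hat{\eta}$ thus equals 1 over $n$ times the variance of this influence function (where averages and variances can be replaced with sample analogs, and population values with consistent estimators).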
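
The Delta-method step can be sketched in a few lines of Python. The function below combines per-subject influence-function values for $\hat{\mu}_1$ and $\hat{\mu}_0$ into a standard error for $\hat{\eta}$, using $\partial\, \mathrm{logit}(\mu)/\partial \mu = 1/\{\mu(1 - \mu)\}$. The function name `log_odds_ratio_se` and the toy inputs are assumptions; in particular, the toy influence values omit the correction term for estimating $\theta_x$ and are used only to exercise the arithmetic.

```python
import math


def log_odds_ratio_se(if1, if0, mu1, mu0):
    """Delta-method standard error of log[mu1 (1 - mu0) / {mu0 (1 - mu1)}].

    if1, if0 : per-subject influence-function values for mu1-hat and mu0-hat
    mu1, mu0 : the corresponding estimated marginal risks
    """
    n = len(if1)
    w1 = 1.0 / (mu1 * (1.0 - mu1))   # derivative of logit at mu1
    w0 = 1.0 / (mu0 * (1.0 - mu0))   # derivative of logit at mu0
    if_eta = [w1 * a - w0 * b for a, b in zip(if1, if0)]
    mean_if = sum(if_eta) / n
    var_if = sum((v - mean_if) ** 2 for v in if_eta) / n
    return math.sqrt(var_if / n)


# Hypothetical predicted counterfactual risks for four subjects.
p1 = [0.2, 0.5, 0.7, 0.4]
p0 = [0.1, 0.3, 0.6, 0.2]
mu1 = sum(p1) / len(p1)
mu0 = sum(p0) / len(p0)

# Simplified influence values: prediction minus its mean (assumes known theta).
if1 = [p - mu1 for p in p1]
if0 = [p - mu0 for p in p0]
se = log_odds_ratio_se(if1, if0, mu1, mu0)
```

A full implementation would add the term involving $E^{-1}\{-\partial U_{ix}/\partial \theta_x^T\}\, U_{ix}(\theta_x)$ to each influence value before calling the function.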

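Consider the marginal log odds ratio defined by (34) with the redefinitions

\[
\mu_1 = E[\mathrm{expit}\{m(X, Z, C; \beta^*) + m(C; \psi^*_{X+1})\}]
\]

and $\mu_0 = E(Y)$. Then using similar arguments as before, we obtain that the influence function for $\hat{\eta}$ is

\[
\begin{aligned}
&\frac{1}{\mu_1 (1 - \mu_1)} \Bigg[ \mathrm{expit}\{m(X_i, Z_i, C_i; \beta) + m(C_i; \psi_{X_i+1})\} - \mu_1 \\
&\qquad + E\left[ \frac{\partial}{\partial \theta_{X_i+1}^T} \mathrm{expit}\{m(X_i, Z_i, C_i; \beta) + m(C_i; \psi_{X_i+1})\} \right]
E^{-1}\left\{ -\frac{\partial U_{i,X_i+1}(\theta_{X_i+1})}{\partial \theta_{X_i+1}^T} \right\} U_{i,X_i+1}(\theta_{X_i+1}) \Bigg] \\
&- \frac{1}{\mu_0 (1 - \mu_0)} [Y_i - \mu_0].
\end{aligned}
\]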